100 Matching Annotations
  1. Last 7 days
    1. ABSTRACT

      Very exciting technology, great work to the authors!

    2. PERC is gentle on cells, permitting sequential editing of multiple loci. As previously reported, this is one way to minimize chromosomal translocations1

      This is a really exciting implication that I hadn't considered before!

    3. Fig. 2 and Fig. 3

      I found these figures a little difficult to fully understand. Here are my small notes about what would improve the presentation; again, please take it or leave it:

      1. Colors: It would be helpful to have a legend that explains what the different colors represent (light blue, dark blue, white, etc.). It took me a while to see the triangle vs. circle for washed and unwashed, but I'm not sure how the colors connect.

      2. Some stats would be helpful here! It can be difficult to assess the differences just by eye. It seems that sometimes washed vs. unwashed are different in terms of edited cell yield (like in the HSPCs) but then are the same for other metrics like % editing and % NHEJ? It would be useful to have some comparisons in the figure, with indications of whether the differences are statistically significant.

      3. This might just be a biorxiv figure display issue, but in 2c, 2f, 3f, 4c, and 4f the white NT bars are missing some of their outlines.
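
      To illustrate the kind of comparison suggested in note 2, here is a minimal sketch of an exact two-sided permutation test on the difference in means, which suits the small replicate numbers typical of donor-based experiments. The function name and the example values are hypothetical placeholders, not data from the paper, and the authors may well prefer a different test.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_pvalue(group_a, group_b):
    """Two-sided exact permutation test on the difference in group means.

    Enumerates every way of splitting the pooled values into groups of the
    original sizes and counts splits at least as extreme as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Small tolerance so floating-point noise does not drop exact ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


# Hypothetical % editing values for three donors per condition (made up).
washed = [62.0, 58.5, 65.1]
unwashed = [55.2, 49.8, 57.3]
p_value = exact_permutation_pvalue(washed, unwashed)
```

      With only three replicates per group there are just 20 possible splits, so the smallest achievable two-sided p-value is 0.1; that limit is itself worth stating on the figure.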

    4. INTRODUCTION

      This is a really helpful introduction to the technology. I appreciate the level of detail provided here; it's clear the authors are being very thoughtful about enabling others to use this approach. Overall I found the description of the technology to be clear and rigorous. I left some comments about details that I would wonder about as a non-expert user if I were trying to get something similar off the ground; please take it or leave it.

    5. The peptide-only condition can be used to test a given cell type’s sensitivity to the peptide, although we note that peptide-mediated toxicity can be exacerbated by the absence of RNP cargo.

      How do you measure cell sensitivity? Is this overall viability, or are there other important metrics to consider?

    6. Assessing editing efficiency

      I really appreciate your step-by-step breakdown of how to evaluate editing success! It could be useful to also explain how to evaluate off-target editing. Do you have a routine approach for this?

    7. For synthesis, we recommend ≥95% purity as assessed by HPLC, which is also used for purification. It has been suggested that an acid exchange step (using HCl or acetate to displace trifluoroacetic acid)

      So helpful! thanks for including this detail.

    8. Two PERC peptides are commercially available: INF7TAT-A5K (A5K)15 and INF7TAT-P55 (P55).

      Where are they commercially available from?

    9. T cells and HSPCs are relatively fragile and generally resistant to transfection

      I can't tell if you are including LNPs under the transfection umbrella or not. I naively would assume yes but am not an expert. Below you talk about LNPs being a viable option for T cell / HSPC delivery, but this sentence up top suggests otherwise. If you are referring to a different type of transfection reagent here being non-ideal for T cell/HSPC delivery, can you specify?

    10. Spacing PERC delivery steps by ≥ 2 d allows each RNP to be metabolized by the cell26,

      Can you clarify if the cells are dividing during this time frame? Or if this is due to the overall stability of the RNP in the cell itself.

    11. NF7TAT peptide had served as the basis for a prior screen for lytic activity in red blood cells as a proxy for endosomal escape20. Our screen of INF7TAT variants in T cells15 identified A5K as well as three additional activating INF7TAT substitutions (G1K, G20L and Y22N) that we have now incorporated in a single peptide: INF7TAT-P55 (henceforth P55)

      Could you briefly explain a little bit more about your peptide reagent? How do the mutations impact activity?

      Also, why aren't the peptides lytic in this context? Do the mutations reduce this activity?

  2. Jun 2024
    1. Next, we investigated whether our strategy is also applicable in an animal model of wound infection

      Really impressive that you also did the mouse work to see if this held up in vivo!

    2. We obtained comparable results regardless of the MOI used, despite the sub-maximal initial inhibition effects at lower MOIs (Fig. 1A-iii).

      While this may be true for DMS3, it may not be universally true for some of the other phages you are using in this panel. For instance, phiKZ can only infect PA14 at high MOIs due to the activity of a jumbophage-targeting immune system. Infection with phiKZ would certainly fail at some of these lower MOIs.

      Check out this preprint for more info: https://www.biorxiv.org/content/10.1101/2022.09.17.508391v1.full

    3. low MOI (MOI = 10)

      As a small note, I would not consider MOI = 10 to be low
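
      For context on why MOI = 10 reads as high: under the usual simplifying assumption that phages land on cells according to a Poisson distribution (and that every phage adsorbs), the fraction of cells receiving at least one phage is 1 - exp(-MOI). A minimal sketch, with a hypothetical function name:

```python
import math


def fraction_infected(moi):
    """Expected fraction of cells hit by at least one phage.

    Assumes Poisson-distributed phage per cell and complete adsorption,
    which is an idealization of a real infection.
    """
    if moi < 0:
        raise ValueError("MOI must be non-negative")
    return 1.0 - math.exp(-moi)


for moi in (0.01, 0.1, 1, 10):
    print(f"MOI {moi}: {fraction_infected(moi):.5f} of cells infected")
```

      Under these assumptions essentially every cell is infected at MOI = 10, and most by multiple phages, whereas at MOI = 0.1 fewer than one cell in ten is.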

    4. We observed that phages belonging to the same complementary group generally share common patterns of interactions with conventional antibiotics

      To me, looking at these data, it appears that the patterns of synergy are being driven by the antibiotic more so than the phage receptor group for both S. aureus and PA14. I think there's some interesting stuff there with phage x antibiotic relationships (like why is Rif so antagonistic in S. aureus?) but I'm not convinced most of the signal is coming from the receptor group. I do think that there could be some cases where it would be - like when the receptor is an antibiotic efflux pump - but that's not the case for most of these phages.

      If you want to pursue the receptor x antibiotic angle, it could be interesting to test the susceptibility of the receptor KO strains to antibiotics. That might give you more direct information about how the phage receptor and the antibiotic may be influencing each other.

    5. While beyond the scope of this work, understanding the mechanisms underlying these patterns will be important for the future success of this approach

      Totally agree! I found this to be a really interesting part of your work.

  3. May 2024
    1. Table S1.

      This table is pretty hard to use in its current format. Would you consider a text version of it, like Tables S2-S5?

    2. These data were used to train a sequence-to-sequence gRNA model that conditionally generates crRNA and tracrRNA sequences for a given protein (Fig. 1a).

      This is so cool that this worked!

    3. After generating the full set of four million sequences, a series of filters were applied to ensure only realistic proteins were used to characterize the model’s generative capabilities.

      It would be really useful to see the numbers of sequences that these different steps filtered out in getting you from 4 mill --> 2 mill.

  4. Apr 2024
    1. Abstract

      This was a really interesting and fun read! Congrats to the authors.

    2. EPG has shown response to magnetic stimuli in mammalian cells in the form of increased intracellular calcium

      Has most of this work been done in Kryptopterus vitreolus? If so (and I make the same comment below) it would be very helpful to make it clear when you are talking about EPG generally vs. results from EPG from KV specifically.

    3. 3F motif of EPG

      In this section it would be very helpful to label the "WT" EPG as EPG-KV (Kryptopterus vitreolus)

    4. It becomes clear that the F in positions 1 and 10 are highly conserved, but the F in position 7 has some variability.

      Interesting that function loss occurs only through loss of the central F in the FXF motif. If the selection is just for loss of function of the EPG protein, I would imagine that this could be achieved in many different ways, such as mutating F in position 1 or 10, or by adding premature stop codons in the sequence. Do you think that EPG performs an alternative function that is being maintained, even while the magnetoreceptive abilities are being selected against?

    5. In this case, tblastn returned 62 unique species that express a protein with high sequence homology to EPG. An additional 10 species with EPG homologs in their genomes were either discovered in the EFISH Genomics database, or uncovered by manually searching unannotated genomes within NCBI databases of species in the same genus as others with an EPG homolog.

      It wasn't clear to me from this analysis what rough % of fish have an EPG homolog. Is EPG ever lost? If so, this could help out your trait association analysis by adding more fish that are functionally EPG negative.

  5. Mar 2024
    1. However, a comparison of the expression levels of defense genes in 48N and cured-48N showed that these defense systems, and additional ones, such as Hachiman, were strongly downregulated in the infected strain

      This is a really cool finding! We saw something conceptually similar when studying the transcriptional regulation of CRISPR-Cas systems in Pseudomonas aeruginosa. We found that numerous Pseudomonas phages had their own copies of a bacterial gene that was involved in CRISPR-Cas transcriptional repression, and these phage transcription factors were capable of down regulating expression of the CRISPR-Cas system in Pseudomonas. (see https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7190418/)

      I'm wondering if you see any cases where these viruses encode host-like proteins? Perhaps that represents a similar case where the virus has stolen defense system regulators and is using them to control host gene expression.

    2. despite the lack of sequence similarity of any of the ORFs encoded by the provirus to any known capsid protein.

      Have you considered using structural homology searches to ID the capsid protein? In this paper (https://www.nature.com/articles/s41396-023-01474-1) they were able to use that approach to assign function to archaeal virus genes of unknown function, and successfully ID'ed a capsid gene.

  6. Feb 2024
    1. Whilst APC-366 did not impact MC degranulation (Fig. 7E-F), pericyte volume was maintained in the presence of the inhibitor (Fig. 7E and G).

      I like this last experiment with the specific tryptase inhibitor - and extra plus for doing it in human cells!

    2. general protease inhibitor

      Can you provide more detail about this general inhibitor? Is this a cocktail? Which mast cell proteases or classes of mast cell proteases (serine, cysteine, etc.) would it be predicted to inhibit? Are there mast cell proteases that you believe would still be active in the presence of this inhibitor?

    3. loss of surface N-cadherin

      Is the idea that the mast cell proteases are directly cleaving N-cadherin, leading to its loss from the cell surface? Is it a known substrate for mast cell proteases?

    4. protease inhibitor cocktail (Merck)

      Can you add a product number for this cocktail, as well as a more complete description of what this cocktail is?

  7. Jan 2024
    1. choice of search engine used for text-based queries is ultimately up to the user and examples of the commonly used platforms or “engines” include but are not limited to PubMed, Google Scholar, and Europe PMC.

      I would love to see the results from searching on biorxiv!

    2. Protein Family Case Study and Literature Review, Curation

      Having a more complete methods section would be really beneficial here so others could recreate your search steps for your example protein, or for another protein of interest. It might be useful to frame more in a how-to guide/operating manual for learning about proteins. Right now there are a lot of interesting comparisons about the efficacy of different tools, but I think there is also an opportunity here to help onboard people onto these different tools by making your workflow and methodology more explicit and clear.

    3. The first step in any protein family analysis requires the gathering of input data (e.g., a sequence or an identifier) that will be used as seed information for queries (Fig. 1 and Fig. S1-2) . This process generates two master lists: 1) a list of identifiers, gene/protein names; and 2) a list of representative sequences. Protein family databases such as Pfam [25], InterPro [26], CDD [27], EggNOG [28] are essential tools in generating these two lists.

      It would be helpful here to be more clear about the methods and order of operations of what is actually being done to retrieve this information. For instance, is a gene/protein family name the initial input for search in these 4 databases? How is the input name selected? How is the output data downloaded and processed? What are the sanity checks to make sure your search is generating useful and on-target information?

    4. Creating a Wiki compiling a non-exhaustive list of web-based resources organized into pedagogical modules for microbiologist

      This is such a great initiative! I'd like to highlight some of my fave tools here in case it's useful:

      clinker is my go-to tool for generating gene neighborhood comparisons and figures (https://github.com/gamcil/clinker) - it's command line but also looks like it can run through the CAGECAT webserver, though I have not tried that yet.

      viptree is a really useful webserver application for comparing viral genomes (I've only tried it with phages) and making trees. https://www.genome.jp/viptree/

      ProteinCartography is our in-house tool (https://research.arcadiascience.com/pub/resource-protein-cartography/release/7) for pulling protein sequence and structural homologs, and visually exploring the data. Please check it out if it seems relevant, and tell us if it is or is not useful for you!

  8. Dec 2023
    1. To predict these interactions, we set out to define genomic traits with predictive power. We show that most interactions in our dataset can be explained by adsorption factors as opposed to antiphage systems which play a marginal role.

      This is a really useful and impressive effort, both for understanding basic phage biology and for deploying phages therapeutically in the clinic. One thing that could be useful to consider adding to the genomic traits that you analyze here is endogenous prophages (and maybe even other MGEs). There are many documented ways that prophages can remodel the surface of the host and change adsorption, as well as have tons of interesting pro- and anti-immune system activities.

    2. Overall, these results show that bacterial defense systems are not required to predict the phage-bacteria interactions and can be removed from the set of candidate traits provided as input features to our models.

      Is one interpretation of this that anti-defense mechanisms are common, and so presence of a defense system is a poor predictor of phage sensitivity?

    3. However, this significance was lost whenever pairs of strains with a phylogenetic distance below 10⁻⁴ substitutions per position were removed (typically less than a few hundred SNPs on the whole core genome), indicating that the correlation was driven by very tightly related kins (Supplementary Figure 4). Our dataset shows that phylogeny poorly explains phage susceptibility.

      This is such a useful observation! I really appreciate that you performed this analysis with and without these highly related strains

  9. Nov 2023
    1. net growth advantage for the colonizer of at least 0.5 doublings per day

      You could consider using tools like iRep to directly calculate the replication rate of different organisms in the community: https://www.nature.com/articles/nbt.3704

      Overall, maybe that could help explain some of the transmission and engraftment differences - does a strain need to be actively replicating in the host in order to be transmitted? Are persister (non-replicating) strains more likely to rebound in the antibiotic-treated condition?

    2. revious studies in mouse models have hypothesized that strain transmission from cohabiting individuals could contribute to these high levels of resident strain recovery

      I guess an important difference between humans and mice here is that mice are coprophagic, which probably wildly increases the rate of microbiome transfer. Is that worth pointing out here? Also, mice share genetics and food too, which also probably helps community sharing.

    3. The total amount of strain sharing varied widely across households

      This variability is wild! Do you know if diet can help explain any of these differences? Do people who cohabitate but don't share strains have very different diets?

    4. showing that subjects can have heterogeneous responses to the same antibiotic perturbation.

      Is looking at antibiotic resistance in these communities useful here? Either from CFU plating on cipro, or using tools like AMR finder? Perhaps that could explain some of the variation in response here.

  10. Oct 2023
    1. Mobilome genes

      Did you ever see IS elements on phages (or plasmids) themselves? The phages might be transporting the elements around the community and could explain some of the community differences in IS element distributions. I know that in the lab phages can get IS elements, and wild plasmids sometimes have them too; curious if this is a common enough thing for you to spot!

    2. susC, susD, or tonB receptor genes

      tonB is a phage receptor - I wonder if the IS mutants have a fitness advantage due to phage resistance

    3. A barrier to studying IS elements in such complex environments results from imperfect methodologies for measuring in situ IS element dynamics. This stems from the poor recovery of multi-copy genes with repetitive sequences by short-read assemblers, leading to fragmented assemblies where IS elements are either absent or become break-points between contigs (20).

      This is a really excellent point - I really like your study design and your motivation for studying IS diversity.

      If you are looking to expand your database or your approaches more - one thought I had is that spacegraphcats could be helpful. It's explicitly designed to work off assembly graphs and so might be able to catch IS elements that are stuck in tangled regions of the graph: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-02066-4

    4. Some individuals had few detectable IS element insertions while others had over 100 unique IS element insertions (Fig. 2B)

      It was surprising to me that some people had so few IS sequences - I would have expected the number of IS sequences across a full metagenome to be much higher. Do you interpret this to mean that there is still a lot of room to grow the IS database? My default expectation would be that a metagenome would have thousands of IS sequences, given their ubiquity as MGEs.

    5. The ISOSDB also has a wide range of transposases representing multiple IS families

      Without knowing a lot about how IS families are designated, I was a little confused by this phylogenetic tree. My naive assumption would have been that all transposases from an insertion sequence family would form a clade together - so you would end up with a tree that resolves the 8 different families as clades. However, there is a lot of intermixing of these families, suggesting that the transposase protein sequences themselves don't form robust clades.

      It might be helpful in the text to explain a bit more about the different families and how they are determined.

      It would also be helpful to explain what is driving the patterns on the tree, since it doesn't appear to be family-level differences.

  11. Aug 2023
    1. Alkaloid-binding globulin (ABG) represents a new small molecule binding functionality in serpin proteins, a novel mechanism of plasma alkaloid transport in poison frogs, and more broadly points towards serpins acting as tunable scaffolds for small molecule binding and transport across different organisms.

      This is such a cool paper, and exciting conclusion!

    2. The lower affinity of OsABG provides further support to the hypothesis that OsABG may be acting as a transporter protein to other tissues, and would be in line with the hypothesis that there may be other mechanisms involved in autoresistance to circulating alkaloids not bound by protei

      I'm curious about how the affinity for OsABG for alkaloids compares to the affinity of these alkaloids for their molecular targets. I would imagine that a sponge/autoresistance protein would need a higher affinity for the alkaloid than the target in order to compete effectively for the molecule, whereas a transport protein may not have such strict requirements.

    3. Recombinant ABG proteins were expressed by Kemp Proteins (Maryland, USA) through their custom insect cell protein expression and purification services.

      I'm curious if you tried to express these in E. coli or yeast first before escalating to insect cells? Would you mind sharing some of your decision making around expression platforms for these molecules?

    1. Thus, only MAGs which showed the expected coverage profiles were labeled as transducers

      Lytic phages can also participate in generalized transduction, and this approach doesn't account for that. By requiring that a prophage be present for peDNA to "count" as being transduced, I feel like you might be missing a lot of instances of legitimate viral transduction coming from non-integrating phages.

    2. MAG was labeled as ‘EV produce

      Does this account for transduction-like mechanisms? It seems like you could also get this signal from viruses packaging up host DNA; I don't see why this would necessarily require an extracellular vesicle.

    3. Calculation of the percentage of non-viral associated reads

      This metric is calculated only using reads that map to assembled contigs (either viral or non-viral), which is going to be a very biased subset of reads. The contigs that are assembling well from a "virome" prep but that are not predicted to be viruses by VirSorter and DeepVirFinder may be other types of MGEs. It seems difficult to get assembly of reads that are derived from host DNA that is being sporadically packaged by GTAs or EVs, with the exception of cases in which there is strong enrichment for packaging of a particular host sequence. I think that a fairer calculation of viral vs. non-viral reads would be to use a read classifier (potentially using a custom database that includes your verified viral contigs), and to report the number of viral reads, bacterial reads, and unclassifiable reads.
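
      As a small sketch of the reporting suggested here: once a read classifier has assigned each read to a category, the percentages can be computed over all reads rather than only those that map to contigs. The category labels and counts below are hypothetical placeholders, not output from any particular classifier or from this dataset.

```python
def read_fractions(counts):
    """Convert per-category read counts into percentages of all reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads to summarize")
    return {label: 100.0 * n / total for label, n in counts.items()}


# Hypothetical classifier tallies for one virome library (made up).
example_counts = {"viral": 600_000, "bacterial": 150_000, "unclassified": 250_000}
percentages = read_fractions(example_counts)
for label, pct in percentages.items():
    print(f"{label}: {pct:.1f}% of reads")
```

      Keeping the unclassified fraction in the denominator makes it visible how much of the library the viral vs. non-viral estimate is actually based on.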

    4. SSU rRNA hits in these datasets are enclosed in VLPs, GTA’s or EVs

      Is it possible to distinguish between SSU rRNA coming from EVs vs. VLPs/GTAs by using chloroform treatment to disrupt the lipid vesicles?

    5. 1.81

      Can you clarify if all the reads from Prochlorococcus EV prep map back to the Prochlorococcus genome? What do the rest of the reads pertain to?

    6. DNA extracted from virus isolates, purified by sequential plaque assays

      Can you clarify 1) what host(s) these viruses are infecting and 2) how the viral particles were purified? In my mind, plaque purification relates to genetically purifying a phage isolate to make sure that the phage population is isogenic, but the purification in terms of separating host from phage would be physical in nature. How did you recover the phage-only fraction in this experiment?

  12. Jul 2023
    1. Bacterial cGAS-like enzymes produce 2′,3′-cGAMP to activate an ion channel that restricts phage replication

      Really interesting paper! I appreciated the mixes of methodologies used in this study - it was an enjoyable read!

    2. These experiments suggest that active BtCap14 transports Cl- from the cis to trans chambers.

      It's not immediately clear to me why chloride transport on its own would block phage replication - I'm quite curious to know what is happening in infected cells. There are some methodologies for measuring ion flux during phage infection in living cells. It could be really useful to apply them to your system, to 1) confirm that chloride efflux is happening in vivo and 2) maybe measure flux of other types of ions to see if this is part of a larger system of changing the concentration gradients of different ions across the cell membrane in a way that might be more mechanistically connected to antiphage defense.

      You could also consider staining the membranes of cells that have been recently infected with phage to see if the membrane potential is disrupted.

      Lastly, I'm curious if this system is sensitive to extracellular (or extra-vesicular) chloride/anion concentrations. Different media have different chloride levels; it could be really informative to grow the bacteria in a higher concentration of NaCl, for instance, and see if the activity of the system is affected.

    3. SPβ or Φ29

      Can you briefly comment on how similar or different all these B. subtilis phages are from one another? Are these different classes of phages (myo vs. podo vs. sipho) or different lifestyles (lytic vs. temperate), or do they have any other notable features of their biology that could be driving this (genome modification, shell formation, etc.)?

    4. 10⁴-fold protection against phages SPP1 and phiB002

      I'm curious about the survivors that are plaquing here... are these escapers? If you pick these plaques, are they sensitive or resistant to your system?

    5. OD600 of cultures expressing wild-type and catalytically inactive BtCBASS both collapsed.

      It's difficult to visualize this collapse with the way that the graphs are plotted on what looks like a log10 scale (2F). It would be easier to see this point with a linear scale, and you could also potentially consider extending the time frame of the growth period to help the reader get a sense of the growth dynamics.

      I have done a lot of experiments like this varying phage MOI and measuring both bacterial OD over time in a plate reader and phage output at the end of 24 hours - often you can get pretty clear binary results at the end of a long time course.

  13. Jun 2023
    1. The MDR E. faecalis strain KUB3006, for which a completely closed genome is available (44), possesses 4 plasmids, including one that encodes linezolid resistance. Yet, it also encodes CRISPR3-Cas (44). The cas9 gene is frameshifted, and the frameshift occurs within the codon for one of the two Cas9 active sites that we previously experimentally confirmed in E. faecalis

      Very cool!

    2. Overall, we posit that the interplay of CRISPR-Cas, plasmids, and antibiotic selection should be further investigated to understand the role of CRISPR-Cas in the antibiotic resistance crisis.

      It would be fascinating to see how the presence of phage might shift this balance!

    3. No mutations were identified in the S6 protospacer or the PAM region of the repB gene in pAM714.

      Curious that no protospacer mutations were found! I'm assuming that the plasmid replicates to a higher copy number than the host chromosome, meaning there would be more opportunity for mutation to happen in the plasmid relative to the chromosome. I wonder if you would see the same patterns if the adaptation machinery were absent from the strain. Can you comment on why you think chromosomal mutations are favored over plasmid mutations?

    4. We sequenced the cas9 coding region of the WT4 population from Day 0 and identified a mutation resulting in an Ala749Thr substitution. Ala749 occurs within the RuvC nuclease domain and is conserved in the well-studied Cas9 of Streptococcus pyogenes (17). The Ala749Thr substitution may result in loss in Cas9 function, causing WT4 to phenocopy the Δ1-Δ6 populations

      Interesting that this mutation was present at day 0 - do you think this Cas9 deactivation mutation was pre-existing in the WT population, and the conjugation experiment directly selected for it? Do you think that there is any low-level toxicity of the CRISPR system in WT strains (even in the absence of selection) that might be generating CRISPR-null mutants at some rate?

  14. May 2023
    1. AKT is at the center of a multitude of different cellular processes ranging from the cell cycle, apoptosis, cell survival, glucose metabolism, and the immune system

      Phage T4 has a glucose-containing modified base - glucosyl-hydroxymethylcytosine (glc-HMC). I'm wondering if the extra glucose coming in on the DNA could be driving some of this signaling? There are mutants of phage T4 that don't have this mod that could be used for a control here. I was also generally curious as to how the DNA modification might impact recognition by intracellular DNA sensors.

  15. Apr 2023
    1. Kraken2 (See Methods)

      Do you have a sense for how robust Kraken2 should be to the degree of polymorphism that you see across tectiviruses? Since it's a k-mer-based tool, at some point enough polymorphisms would break it, but idk where that threshold is relative to what you see in natural tectivirus populations. Overall I'm really fascinated that these viruses are evading your detection in metagenomes, and I'm super curious as to why.
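
      To put a rough number on the threshold question: if substitutions are assumed to fall independently and uniformly along the genome at per-base divergence d, the chance that any given k-mer still matches the reference exactly is (1 - d)^k. The sketch below uses k = 35, which I believe is the Kraken2 nucleotide default but should be checked against the database actually used; it also ignores indels and Kraken2's minimizer and spaced-seed details, so treat it as a back-of-the-envelope estimate only.

```python
def intact_kmer_fraction(divergence, k=35):
    """Expected fraction of k-mers with no substitutions.

    Assumes independent, uniformly distributed substitutions at the given
    per-base divergence; k=35 is an assumed default, not taken from the paper.
    """
    if not 0.0 <= divergence <= 1.0:
        raise ValueError("divergence must be between 0 and 1")
    return (1.0 - divergence) ** k


for d in (0.01, 0.05, 0.10, 0.20):
    print(f"{d:.0%} divergence: {intact_kmer_fraction(d):.3f} of 35-mers intact")
```

      Under this simple model roughly 70% of 35-mers survive at 1% divergence but only a few percent survive at 10%, so it would be informative to know where natural tectivirus diversity sits on that curve.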

      Another approach would be to assemble the viromes and then see if you can pull out contigs with BLAST hits to tectiviral marker genes. You could also try querying for the marker genes directly on the assembly graph with a tool like spacegraphcats to ensure that you aren't missing the virus due to assembly failure.

      Assembling the metavirome would also allow you to demonstrate some of the features of the viruses that you do recover using these methods. Are there dsDNA viruses smaller than tectiviruses recovered robustly? I would imagine yes bc even tho 12kb is smallish, it's not unreasonably tiny. Some of the phages that I've worked w in metagenomes have also been around this size range and they are incredibly abundant and easy to find, which makes me feel like something else is going on.

    2. Viral

      Overall amazing work! The assay is really clever, and you pulled out a ton of cool phage diversity. I really liked the phenotypic host range assessment - it was a surprise that you got such different phenotypes from such similar genotypes. Congrats on a fascinating paper!

    3. no accessory genes

      wild that they don't have accessory genes!!!

    4. Still, the discrepancy points to the continued need for systematic culture-based viral discovery.

      It would be helpful to evaluate these possibilities more systematically in the section of the paper where you discuss the absence of tectiviruses from metagenomic sequencing datasets.

      1. Did you have any trouble isolating DNA from the tectiviruses in culture that would hint at their exclusion from bulk DNA harvests? If you suspect that the nature of the DNA is the problem, you could consider looking at RNAseq datasets for tectiviral reads (see https://pubmed.ncbi.nlm.nih.gov/32517038/)

      2. When you sequenced the wastewater, did you find the amount of tectivirus reads you expected to? Some of your figures might be getting at that, but they are not discussed in the text (4D, E) making it hard to know what your interpretation is.

      3. If you did recover a reasonable amount of tectivirus reads from your sequencing, did they assemble into genomes? Assembly issues feel like the most likely culprit to me, given how shockingly similar their genome content is, alongside the high levels of nt diversity at some positions. Two ways to approach this problem would be to either work directly from assembly graphs using a tool like spacegraphcats (https://github.com/spacegraphcats) or to use long read sequencing. There might already be some good wastewater long read metagenomes out there to look at.

      4. If assembly isn't the problem and they do assemble into genomes, are those genomes found by tools like genomad (https://github.com/apcamargo/genomad) which was used to make the current iteration of img/vr?

    5. However, none of the uncultivated viral genomes appear to belong to any of the pre-existing groups of isolated tectiviruses

      It might be helpful to include your isolated phages in this tree to see whether the IMG genomes cluster with them. Since you only have one representative of alphatectiviruses on the tree, it is more challenging to conclude the relatedness of the IMG genomes to isolated alphatectiviruses more broadly.

    6. alphatectiviruses have yet to be found in metagenomic analyses

      What do you mean no one has found them? Below, you say that you were able to retrieve viral genomes from IMG using the PRD1 coat protein which is at odds with that statement.

    7. c,

      Do you mean D?

    8. We speculate that these patterns might reflect the compositions of natural polymicrobial communities containing IncP plasmids, which require PDPs to rapidly adapt to infect particular assortments of taxonomically distant hosts.

      This patchy host range could also be due to specialized anti-PRD immune systems where protection against one PRD doesn't necessarily guarantee protection from a closely related PRD

    9. This observation formed the basis of the targeted phage discovery method we termed “Phage discovery by coculture” (Phage DisCo)

      This is such a clever approach! I really love it. You could imagine expanding to all sorts of other questions about phage host range, like finding phages that are resistant to different bacterial immune systems.

  16. Mar 2023
    1. Moreover, although most host spacers matched to a single virus, two spacers from the Rhodobacteraceae host aligned to Thiohalocapsa phage MD04 and Desulfofustis phage MD02

      Can you clarify what % ID this match is?

    2. 2,802 unique CRISPR spacer sequences

      This is a really high number of unique CRISPR spacers. CRISPR arrays in bacteria tend to be small-ish (less than 50 or so spacers) while archaea can have larger arrays (less than 100 or so). Since there are only 4 different bacterial strains present, retrieving thousands of spacers would require a huge number of independent CRISPR-Cas systems in these strains, and/or extremely rapid CRISPR adaptation. Can you add more information to contextualize this finding?
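
      To make the scale of this concern concrete, here is a rough back-of-the-envelope sketch. It uses only the figures quoted in this comment (2,802 spacers, 4 strains, roughly 50 spacers per bacterial array) and assumes, purely for illustration, that every array is completely full and that no spacers are shared between arrays. The function name is hypothetical.

```python
import math

# Figures quoted in the comment; the per-array cap is a rule of thumb,
# not a measured value for these genomes.
TOTAL_SPACERS = 2802
N_STRAINS = 4
MAX_SPACERS_PER_ARRAY = 50  # assumed upper bound for a typical bacterial array


def min_arrays_needed(total_spacers: int, max_per_array: int) -> int:
    """Smallest number of full arrays that could hold all unique spacers."""
    return math.ceil(total_spacers / max_per_array)


arrays_total = min_arrays_needed(TOTAL_SPACERS, MAX_SPACERS_PER_ARRAY)
arrays_per_strain = arrays_total / N_STRAINS

print(f"Minimum arrays overall: {arrays_total}")
print(f"Average arrays per strain: {arrays_per_strain}")
```

      Under these simplifying assumptions each strain would need to carry more than a dozen full arrays on average, which is why more context on the spacer count would be helpful.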

    3. 48 unique repeat sequences from four reference genomes for known pink berry-associated bacteria:

      This level of repeat diversity is much larger than I would have expected and implies that each reference genome has an extremely large number of independent CRISPR-Cas systems. Alternatively, the repeat finder is erroneously detecting CRISPR repeats. Can you please add more information about what types of CRISPR-Cas systems you are finding in each of these reference genomes to contextualize this repeat diversity?

    4. The remaining five genomes of interest lacked sufficient protein similarity for a connection in the vConTACT network, indicating that these phages represent novel and undescribed diversity.

      RefSeq is not a great representation of total viral diversity, making it difficult to evaluate viral novelty simply from Pink Berry phages not being closely related to RefSeq viruses.

      Could you compare instead to viral MAGs from ecologically similar samples - like other marshes - to determine their relative novelty?

    5. One such AMG was a darB-like antirestriction gene encoded on the Thiohalocapsa phage MD04 genome (Suppl. Data 1). Interestingly, darB has been shown to methylate phage DNA to resist host restriction modification (RM) systems (Iida et al., 1987; Iyer et al., 2017).

      Because auxiliary metabolic genes (AMGs) more commonly reflect instances where phages enhance bacterial energy metabolism, you might want to consider calling these methylases "anti-defense" genes to more accurately reflect their proposed ecological role.

  17. Feb 2023
    1. This is really interesting and rigorous work structurally characterizing the Mcr complex and the role of McrD. It's really impressive that you purified the complex from the native system. Congrats and good work! A few questions:

      I found it striking that McrD was not essential for Mcr complex formation, despite being so well conserved.

      1. Does McrD have any recognizable domains or folds that could tell you about how it might be participating in Mcr complex biogenesis? Can this tell you anything about its specific evolutionary history or mechanism of action?

      2. Is McrD found in organisms that don't have the other pathway members? If so, what other cellular processes could it be involved in?

  18. Dec 2022
    1. This is really interesting work! The approach of environmental ABPP is clever and has a lot of potential applications. It is a nice way to improve our understanding of the functional potential of uncultivated microorganisms and their encoded proteins. This example of serine hydrolase discovery is a nice vignette applying the protocol, and I look forward to future developments in this area!

    2. extraction of eDNA

      Can you clarify if the DNA is environmental in source - i.e. "eDNA" that is extracellular and released into the environment? Or is the DNA from the microbial community? I'm guessing the latter since that is where you would get good assembly and bins. In that case, would you consider calling the DNA "community DNA" instead of "eDNA"? eDNA has a very specific definition as environmental DNA (i.e. extracellular), which I don't think is what you mean here.

    3. two-step ABP aiming to target enzymes with the desired activity among the microbial enzyme repertoire directly at the site of sampling (eABPP)

      Can you clarify if these enzymes are secreted enzymes that are stable in the environment, or if they are expected to be intracellular proteins that are exposed upon lysis conditions? Based on your methodology, it seems like you are mostly profiling extracellular serine hydrolases - can you explain the motivation for studying secreted proteins instead of the full proteome?

  19. Nov 2022
    1. This is a really interesting technique, with a lot of potential use cases for exploring genome modification via sequencing. I am most excited about the potential to use this technique to detect phage genome modification in natural samples, as well as to enrich modified phage sequences and promote phage MAG assembly from metagenomes. You could imagine that this would help pull out low coverage phage genomes, if the phage had the right modification. If you want to expand beyond T4, a good group of phages to look at next would be the 5hmdU-containing Bacillus phages, like SP8 or SPO1. It doesn’t seem like 5hmdU is as broadly protective against restriction enzymes as T4’s methyl-glucosylation, but you could experiment with boosting the efficiency of protection by treating the sample with a 5hmdU DNA kinase (see here: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7653180/ , commercially available through NEB). I hope you keep developing this technique for other types of mods!

    1. Review information: Review of version 2 of the manuscript

      DOI: 10.22541/au.166017966.64892451/v1

      Reviewer: Adair Borges, Scientist at Arcadia Science. Expertise in bioinformatics and CRISPR-Cas biology.

      My main comments are about further clarification of nomenclature and classification, regarding both Serratia species classification and CRISPR-Cas classification. One of the most confusing things about the CRISPR field is the nomenclature of CRISPR systems. This is not the authors' fault, but it does mean that it is critical to use extremely clear definitions to enable comparisons of the current findings to past and future work.

      Also, bacterial species classifications and re-classifications can be fraught and often create confusion. The authors do a pretty good job of addressing some of the Serratia reclassifications, but I would really like it if they included a short explanation of the proposed reclassification of ATCC 39006 since it is such a prominent model in the field (more on that below).

      Also, more clarity in the Data Availability Statement would be excellent and help others reuse these findings!

      Specifics:

      • Thanks for the clarification in your response about the proposed reclassification of ATCC 39006. Would it be possible for the authors to add a brief note to the main text mentioning the proposed re-classification of ATCC 39006 to explain why no Type III systems were found here? I think it's important since many readers (like myself) might think of Serratia as a “model” for Type III CRISPR, and might be confused that Type III is missing here.
      • I appreciate the authors changing the nomenclature from “variant” to “unique locus”. Can you also add a definition for “canonical” loci, like in line 179? Are you calling it canonical because it is similar to previously identified CRISPR-Cas loci in Serratia? Or in the case of I-C, how do you judge if it is canonical or not? What locus architectures are they being compared to? Please spell out in the main text exactly why you are categorizing some loci as canonical and some as atypical.
      • If the Cas gene sequences are available through the CRISPRCas++ database, can the authors explicitly state that in the text? The Data Availability Statement should be made more clear to say exactly what data are included as Supp Files, and which data should be retrieved from external databases. In the R&R, the authors indicate that the 16S data also would need to be retrieved using accession numbers, so this should be mentioned too.
  20. Sep 2022
    1. This is a really impressive and large experiment that clearly took a lot of time and resources. I imagine many folks interested in interspecies interactions, particularly between algae and bacteria, would be interested in looking at these data. It seems like this would be a rich resource to answer many different questions. Maybe we missed this - but are the 16S data deposited anywhere to enable reuse?

    1. Super exciting work! I was especially intrigued by your report of a very large prophage (~300 kb). I would recommend double checking that result manually by looking at the contig and the gene content of the putative prophage, as virus finders can have false positives and can incorrectly call the borders of prophages, especially if they are integrated into a lot of other mobile DNA. If the prophage really is ~300 kb, that would be super exciting! Large phages are often lytic or pseudolysogenic (i.e. don't integrate), and prophages tend to be small relative to non-integrating phages. What a lovely paper!

  21. Aug 2022
    1. Review information: Review of version 1 of the manuscript

      DOI: 10.22541/au.166017966.64892451/v1

      Reviewer: Adair Borges, Scientist at Arcadia Science. Expertise in bioinformatics and CRISPR-Cas biology.

      Big picture:

      Serratia encodes a number of CRISPR-Cas systems, and is a powerful model for the study of CRISPR-Cas biology in a natural system. Scrascia et al. use bioinformatic tools to survey the distribution of CRISPR-Cas systems across 15 Serratia species (482 genomes in total). They report the identification of Type IE and IF systems similar to those previously observed in Serratia, as well as Type IC systems, which have not been previously observed in Serratia. This work extends a body of knowledge about Serratia CRISPR-Cas diversity and distribution, and will be especially useful in continuing to grow Serratia as a model system in the study of CRISPR-Cas biology in bacteria.

      Feedback on CRISPR-Cas analysis and nomenclature:

      • I was surprised to see that the authors did not detect any Type III CRISPR-Cas systems in this study, since Type III CRISPR-Cas systems are present in Serratia (at least in Serratia sp. ATCC 39006) and well studied (e.g. Patterson et al, 2016 doi: 10.1016/j.molcel.2016.11.012, Malone et al, 2020 doi: https://doi.org/10.1038/s41564-019-0612-5, Smith et al 2021 doi: 10.1038/s41564-020-00822-7, and many more). Can the authors hypothesize why they only detected Type I CRISPR systems in this analysis? It would be very helpful to have an acknowledgment of Type III in the paper, and a brief explanation of why this current analysis did not detect any Type III systems.
      • I was a bit confused with the variant nomenclature I-ES1, I-ES2, and I-F1S1 used in this paper. Since there are some well described Type IF variants that have profound functional differences from canonical CRISPR-Cas systems (like transposase-associated F2 variants) it would be really helpful if the authors could clarify in the text if the “S1” and “S2” naming is specific to this study, and whether or not they expect these variant systems to be functional in anti-phage immunity. From first glance it appears that the variants differ in the locus organization, but would still be expected to function canonically given that they have all the genes necessary for interference. To reduce confusion, it might be helpful to remove the emphasis on “variants” and instead refer to them exclusively as unique locus architectures or Cas gene organizations.
      • I was also unclear on the naming of “alien” vs. “canonical” arrays. The authors define the alien arrays as arrays that were “not consistent with the subtype of the cas gene set harbored in the same genome”. From this definition I wasn’t sure if this meant that the alien array was unclassified by the bioinformatic tools used, or if it was classified as being associated with another CRISPR-Cas subtype. It would be helpful if the authors could disambiguate these two circumstances since they have different implications. In the cases that the alien arrays are unclassified, can the authors refer to them as unclassified? And if they are classified as arrays belonging to another subtype, can the authors provide the subtype that they are associated with?

      Main Figures:

      • Figure 2: To get across the different locus organizations found in this study, it would be clearer to color different Cas genes in different colors, to highlight that they are found in different arrangements. The gene names are small and a little hard to read.

      • Figure 3: I like that the authors included the raw numbers of genomes analyzed vs. genomes with CRISPR Cas systems detected, but the Y-axis is misleading. While the Y-axis goes to 50, the highest values on this chart are almost 10-fold higher. For the S. marcescens genomes, this compresses the difference between total genomes analyzed (382) and the number that are CRISPR-positive (48). Other differences are likewise compressed. I understand what the authors are trying to do here, but it is still a little misleading visually. A broken Y-axis might help here.

      • Figure 4 and Figure 5: A legend with the species color would be super helpful here! The text is tiny. Also, bootstrap values/symbols would aid in the interpretation of the tree, and are especially important to support the authors' claims about the evolution of sub-lineages.

      • Figure 6: I think it's really cool that the authors discovered these conserved hot spots for CRISPR-Cas integration! This figure, combined with Table S4, is my favorite part of the study.

      Supplementary Files

      • Would it be possible for the authors to supply the supplementary tables in Excel/csv/tsv format? These are really nice genomic resources, but it is very hard to interact with them in pdf format. The 16S tree would likely be better as a newick file (or whatever text-based tree file went into this) - in its current state as a very large pdf it is likewise difficult to interact with.
      • Please consider adding the underlying tree file used to generate Figure 4
      • To enable reuse of this data, it would be really helpful to have the Cas gene sequences in addition to the spacer sequence. Would the authors consider adding Cas gene sequences as a supplementary file?

      General notes on clarity:

      • The current title “CRISPR-Cas systems in Serratia” is a little broad, and feels more suited to a review than a research paper. Something like “Bioinformatic survey of CRISPR-Cas systems across 15 Serratia species” would be more specific and helpful in demonstrating where this paper fits into existing literature.

      • I appreciate the authors’ attempts to streamline the manuscript by using abbreviations for commonly used phrases, but sometimes I got confused with some of the abbreviations that were less standard to the field (CG, HQA, CDR for example). I would recommend just writing out “complete genomes” instead of using the abbreviation “CGs” and “high-quality assemblies” instead of “HQAs”. Also while “DRs” is a common abbreviation for “direct repeats”, I would advise writing out “consensus DRs” instead of abbreviating to “CDRs”.