On 2020-03-09 19:39:13, user Fraser Lab wrote:
This manuscript by Leander et al. uses TetR as a model system to explore the robustness of an allosteric response (in this case, coupling between drug and DNA binding) to mutation. The paper uses high-throughput mutational scanning, with FACS coupled to deep sequencing, to identify variants that compromise function. As a follow-up, the authors conduct a break-and-restore secondary screen: they generated libraries in the backgrounds of 5 deleterious mutations to identify rescuing suppressor mutations by FACS, followed by sampling with Sanger sequencing. They use structural modeling (in particular Rosetta and MD) to develop potential mechanistic explanations for these mutations.
Overall, the data presented show that empirically identified allosteric residues appear to be distributed across TetR, are not conserved, and have a variety of potential underlying structural mechanisms. The authors take this to mean that, broadly, allostery is distributed and not conserved. The generality of the present approach is perhaps a bit overstated ("profound impact", "radically reframe"), but this is a great example of leveraging the classic strategy of identifying suppressor mutants with a functional screen while taking advantage of the power and massively parallel nature of modern high-throughput sequencing. Given the focus on plasticity and robustness, there could be more citation/discussion of previous work on protein robustness and on suppressor-mutation strategies. Many of the conclusions could also be put in context with previous work on allostery in this system (see: Reichheld and Davidson, PNAS, 2009), which puts forth an alternative subdomain folding model that is not really considered here.
One of the main arguments in the introduction is that previous works weren't comprehensive. From our reading, only one experiment, presented in the 'structural hotspots more conserved than allosteric' section, measured all (or a nearly comprehensive set) of the mutations with deep sequencing. Although the libraries were made, it is unclear why Sanger sequencing, as opposed to deep sequencing, was used for the break-and-restore experiments. Moreover, the paper does not make clear which statistical tests are used to validate qualitative observations. For example, somewhat arbitrary thresholds are set and used to define whether a region is an allosteric hotspot. In general, the thermodynamic coupling between one residue and another is not binary, so it does not make sense to treat the data qualitatively. It would make more sense to develop a quantitative score for whether a residue is allosteric, based on the deep mutational scanning data. For example, if some mutations are harder to rescue, you should expect not only that fewer residues will rescue them, but also that the residues that do should show stronger coupling than those rescuing easier-to-rescue mutations - a core argument in the paper. This should be measured and tested quantitatively. Percentages should be reported somewhere for each of the rescued background libraries. It is quite possible all of this data is there, just not presented clearly.
Similarly, if the assignment of allostery were made quantitative, it would be easy to calculate the correlation between allosteric residues and conservation; alternatively, as is, it would be easy to calculate a z-score between the conservation of the dead vs allosteric residue populations. This would quantitatively back up the paper's claim that allosteric residues are not conserved. There are many other places throughout the paper where it would be appropriate to do a statistical test.
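To make the suggested test concrete, here is a minimal sketch of the dead-vs-allosteric conservation comparison. All numbers are invented for illustration; the real inputs would be the authors' alignment-derived per-residue conservation scores and their residue classifications.

```python
import math
import random
import statistics

# Hypothetical per-residue conservation scores (illustrative only; the
# real inputs would come from a TetR-family alignment).
random.seed(0)
cons_dead = [random.gauss(2.0, 0.5) for _ in range(30)]        # "dead" residues
cons_allosteric = [random.gauss(1.2, 0.5) for _ in range(25)]  # allosteric residues

def two_sample_z(a, b):
    """Welch-style z-score comparing two group means."""
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se

z = two_sample_z(cons_dead, cons_allosteric)
# Two-sided p-value from the normal approximation.
p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
print(f"z = {z:.2f}, p = {p:.3g}")
```

A nonparametric alternative (e.g., a Mann-Whitney U test) would avoid the normality assumption if the conservation scores are skewed.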
Overall, the paper is hard to follow as written. For example, it is confusing that mutations in various mutational backgrounds are presented before the single-mutation data. It might make more sense to present the single-mutation datasets first, followed by the rescuing mutations in those backgrounds. It is also unclear whether the deep sequencing data from the single-mutation libraries were used to decide which mutations would serve as backgrounds for the second-order mutations.
The major successes of the paper are the "break-restore" cycle of mutagenesis and the integration of a potential structural framework to develop mechanistic explanations for some mutations, which is often the missing step in deep mutational scanning studies. The major concern we have with this data is that the timescale of the MD simulations (while impressively long, at microseconds) is still insufficient to get at many issues of subdomain folding (see again Reichheld and Davidson) and other aspects of the conformational ensemble that may mediate allostery in this system (especially if it is not simply a matter of an "active" and an "inactive" structure).
Specific points:
Throughout the paper, it is unclear why methods were chosen, how assays were developed, and whether statistical tests were done. Some examples:<br />
* How were libraries generated? "Chip DNA" is not sufficient information. From the methods, it looks like inverse PCR and Golden Gate assembly were used; this high-level information should be in the body of the paper. How do these libraries compare to similarly generated libraries? <br />
* There are triple mutations in the library. Where did these come from?<br />
* Nowhere in the paper is the quality of the libraries discussed. How much WT is present? How many of the possible variants were observed? How much coverage of the effective library size (considering WT) was there at sorting/sequencing? Baseline library statistics (WT %, % of variants present, bias) are needed to determine how well the NGS experiments went.<br />
* How was the threshold for ‘low’ GFP decided on? Were any controls used? More broadly, were controls used to determine any thresholds? Example raw data for this experiment should be in the supplement.<br />
* In the break-and-restore first-step experiment presented in Fig 1C, it is mentioned that many mutations were disruptive but 5 were chosen as backgrounds for secondary libraries. How many mutations were disruptive? Is this the data presented later in Fig 3? If not, this primary screen should be in the supplement. Why these 5, apart from their being distributed across TetR? Strongest signal? Did they represent distinct clusters? <br />
* How is partial vs full rescue of function defined? How do the authors think about positions whose rescuing mutations all have similar effects vs those with a range of responses? For example, D53V and N129D seem to be rescued by more or less the same amount, whereas (impossible to know as a reader without statistics...) R49A and especially G102D show vastly different responses. <br />
* Fig 1C ranks mutants by mean. Ranking by mean does not seem appropriate, given that G102D is the second most easily rescued in Fig 1C but the hardest to rescue in Fig 2B. This seems odd. This idea is discussed somewhat in the next section, and it may not make sense to rank-order this data.<br />
* How and why were thresholds chosen? Why couldn't this same analysis be done on the Fig 1C data by binning fluorescence? If 1000 mutants were screened, why are there not 1000 mutants in Fig S3? Where is that raw data?<br />
* The authors discuss that rescuing residues are either unique to a given mutant background or shared across multiple backgrounds. They call this 'variant-specific regional bias'. However, only 200 out of a possible ~3000 variants per background are sampled, so it is hard to know whether this analysis is meaningful. It is unclear why these experiments were done with clonal Sanger sequencing rather than Illumina sequencing. An added benefit would be the ability to do thermodynamic double-mutant cycle calculations to quantify the coupling between all mutations. This would just require sequencing the baseline libraries as well.
* 5/20 mutations having a signal was used as the threshold for allosteric residue classification. This seems somewhat arbitrary unless it was quantitatively determined to be a good threshold. It would make more sense for every residue to get a coupling score based on depletion of weighted sequencing reads, with a statistically defined threshold (R packages like DESeq2 can do this easily) for calling a residue allosterically coupled.<br />
* Thermodynamic coupling is not binary, so enrichments could be treated quantitatively. Then it would be easier to judge the data and to calculate statistics. How many residues were missing from the dataset? How common are allosteric sites? Looking at Fig S4, it is hard to know whether white residues are missing data or mutations that don't meet the cutoff.<br />
* A statistical test could be used to back up the statement that allosteric residues aren't conserved. As is, it would be easy to calculate a z-score between the conservation of the dead vs allosteric residue populations. Ideally there would be a quantitative score that could be used to calculate correlations with conservation and, later, centrality.<br />
* A baseline high-throughput experiment was done without ligand to see how TetR represses without induction. The authors interpret mutations that raise GFP without ligand as destabilizing DNA binding. However, such mutations could alternatively affect baseline expression by disrupting TetR structure or dimerization. This should be mentioned.<br />
* Why was a triple mutant chosen for the rescued MD simulations when H44F had a stronger signal (Fig 1C)? Also, a double mutant would better limit higher-order epistatic effects.<br />
* In Figure 4D, there does appear to be a broadening of the distributions and a leftward shift of the two populations. Is this meaningful? Is there any insight into why the triple mutant isn't all the way back to WT?
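The thermodynamic double-mutant cycle analysis suggested above could look something like the following. All enrichment values are invented for illustration, and the mapping from log enrichment to an apparent free energy is a common but strong assumption, not something established by this paper.

```python
import math

# Hypothetical enrichment ratios (variant reads / WT reads, post- vs
# pre-sort) for WT, two single mutants, and the double mutant.
# These values are made up for illustration only.
enrich = {"WT": 1.00, "A": 0.20, "B": 0.50, "AB": 0.45}

def ddG(variant, rt=0.593):  # RT in kcal/mol near 25 C
    """Apparent free-energy change relative to WT, assuming log
    enrichment is proportional to -ddG (a strong assumption)."""
    return -rt * math.log(enrich[variant] / enrich["WT"])

# Coupling energy from the cycle: ddG(AB) - ddG(A) - ddG(B).
# A negative value means the double mutant is fitter than the additive
# expectation, i.e., B partially rescues A.
coupling = ddG("AB") - ddG("A") - ddG("B")
print(f"coupling energy = {coupling:.2f} kcal/mol")
```

With deep sequencing of the baseline and sorted libraries, this calculation could be run for every (background, rescuing mutation) pair, replacing the binary allosteric/not-allosteric calls with a continuous coupling score.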
Throughout the manuscript there are broad generalizations that are not consistent with our view of the literature. Here are some examples:<br />
* The authors discuss TetR having a high degree of allosteric capacity based on the results. However, without more datasets or discussion of previous work in this space, it is hard to say whether TetR has high allosteric plasticity.<br />
* The authors postulate that the ease of rescuing a dead variant may correlate with how stabilized the inactive state of the protein is. However, the literature has certainly considered this, and it should be discussed/cited if this section remains. <br />
* The authors talk about how their work radically reframes the problem and is very impactful. We will leave the impact for history, but this is a pretty classic strategy and we fail to see what is “radical” about it. It is a great example of using modern technology on a “classic” system - that is cool!
Throughout the manuscript there are explanations whereby the logic is unclear. Here is an example that would benefit from further explanation: <br />
* In the section after the site-specific mutation section, the authors conduct Rosetta modeling to develop putative mechanistic explanations for several of the mutations. Here the authors see reduced helix-turn-helix stability, but there is no explanation of its significance.
Insufficient background/missing citations<br />
Throughout the manuscript, background is lacking and many citations are missing. Here are some examples:<br />
* 'Thermodynamic [coupling] does not require spatial connectivity' should have a citation<br />
* 'Allosteric signaling occurs through redundant and robust networks': based on one example from one paper, it is improper to generalize. There should be citations here, as there are certainly more examples of allostery being redundant.<br />
* The authors discuss allosteric hotspots but do not cite the work that came up with the concept. For example, Rama Ranganathan's work is cited earlier in the paper and should be cited again here.<br />
* Citations are needed for the work that identified mutations in the DBD and LBD<br />
* Centrality is used to identify residues associated with allostery. The authors mention that in some instances it does not predict their allosteric classification. How does this compare to previous evaluations of centrality's performance as an allosteric metric?<br />
* More discussion of how the field views the conservation of allostery would be good. It is not entirely novel that allosteric sites are less conserved than catalytic/binding sites; see, for example, Fig 1B of Yang J-S, Seo SW, Jang S, Jung GY, Kim S (2012) Rational Engineering of Enzyme Allosteric Regulation through Sequence Evolution Analysis. PLoS Comput Biol 8(7): e1002612.
A major rationale and point the authors make in the introduction is that previous studies have not been exhaustive; however, many of the experiments presented here are themselves clonal, with limited sample size. Some examples:<br />
* If this is 200 variants per position, this is nowhere near exhaustive. How is there only 1 variant for G102D in Fig 2A when in Fig 1C there were more? Were any statistical thresholds used for the data in Fig 2B? <br />
Figures<br />
Fig 1B<br />
It would be nice to see raw data somewhere for the gating, to get a sense of what the library data looked like. It is unclear why only the top and bottom gates were collected rather than a series of bins. It would also be good to know what percentage of the population these gates represented.<br />
Fig 1C<br />
How many replicates were done for each? There should be extensive statistical tests here between mutants, WT, and the background single mutations. <br />
Why are there triple mutants? It seems triple mutants shouldn't be included, as that starts moving into higher-order epistatic space and is hard to discuss.<br />
It is unclear why the mean was used to rank-order these, as several clearly don't fall in line, especially G102D<br />
Fig1D<br />
Hard to read labels. Poor contrast.<br />
Fig 2A<br />
Seeing the raw data for these would be good. We don't think it is appropriate to bin this data; instead there should be a numerical value for fold induction, so that induction could be scored quantitatively. Statistical tests are also needed.<br />
Fig 2B<br />
The raw data for this would be good to have in the supplemental figures<br />
Fig2C<br />
Hard to read residue labels. It would be nice to have an example with a truly allosteric explanation, as all of these are just direct interactions.<br />
Fig2D<br />
This hypothesis could have been more fully tested if full libraries had been characterized<br />
Fig3A<br />
This is really hard to interpret. The distributions are clear, but there should be a quantitative comparison.<br />
Fig3C <br />
Same comment as Fig 3A.<br />
Fig 3D<br />
Better labeling is needed. What are the top and bottom panels? Also, pointing out where the modeled residues are in Fig 3C would be good.
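The quantitative comparison requested for the Fig 3A/3C distributions could be as simple as a two-sample Kolmogorov-Smirnov statistic between the WT and mutant ensembles. A minimal sketch, with invented values standing in for whatever per-frame observable underlies the plotted distributions:

```python
import random

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic (maximum ECDF gap)."""
    a, b = sorted(a), sorted(b)
    values = sorted(set(a) | set(b))
    d = 0.0
    for v in values:
        fa = sum(x <= v for x in a) / len(a)
        fb = sum(x <= v for x in b) / len(b)
        d = max(d, abs(fa - fb))
    return d

# Hypothetical per-frame values for WT vs a mutant ensemble
# (illustrative only; not the authors' data).
random.seed(1)
wt = [random.gauss(1.0, 0.2) for _ in range(500)]
mut = [random.gauss(1.3, 0.3) for _ in range(500)]
d = ks_statistic(wt, mut)
print(f"KS D = {d:.2f}")
```

For correlated MD frames, the effective sample size is much smaller than the frame count, so any p-value should be computed on decorrelated (subsampled) frames rather than on every frame.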
Grammar:<br />
There are missing 'a's, 'the's, etc., but here are some examples, as well as a couple of other issues:
Page3:Line7<br />
‘the’ decentralized<br />
Page3:Line10<br />
Unclear what ‘they’ refers to. <br />
Page4:Line5<br />
‘Time and again’ and ‘myriad’ are redundant<br />
Page4:Line14<br />
‘a’ biochemical understanding<br />
Page4:Lines19-20 <br />
‘a’ promoter and ‘that’ promoter<br />
Page6:Line11: <br />
‘a’ high degree<br />
Page6:Line16<br />
'allosteric' signaling<br />
Page7:Line11 <br />
Break up the one massive paragraph after sentence 10 in the site-specific rescuability of allosteric dysfunction section.<br />
Page8:Line15<br />
Why are hotspots in parentheses? This is confusing.
We were prompted to review this by a journal. James Fraser and Willow Coyote-Maestas