Reviewer #3 (Public Review):
This manuscript explores the concept of TCR convergence, defined here as the presence of TCRs with the same amino acid sequence but distinct nucleotide sequences. The central premise is that TCR convergence is a sign of antigen-driven selection. TCR convergence as a biomarker for immune checkpoint blockade (ICB) response is also investigated. Although both these ideas have been put forward in the literature, this manuscript provides some new analyses and a new perspective on these topics.
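To make the definition concrete, here is a minimal sketch (not the authors' pipeline) of how convergent TCRs could be identified from a list of clonotypes. The function name `find_convergent`, the input format, and the toy sequences are illustrative assumptions. Depending on the exact definition used, real data might also need to be grouped by V and J gene.

```python
from collections import defaultdict

def find_convergent(clonotypes):
    """Group clonotypes by CDR3 amino acid sequence and keep the groups
    encoded by two or more distinct nucleotide sequences.

    clonotypes: iterable of (cdr3_nt, cdr3_aa) pairs (hypothetical format).
    Returns {cdr3_aa: sorted list of distinct cdr3_nt}.
    """
    nt_by_aa = defaultdict(set)
    for cdr3_nt, cdr3_aa in clonotypes:
        nt_by_aa[cdr3_aa].add(cdr3_nt)
    return {aa: sorted(nts) for aa, nts in nt_by_aa.items() if len(nts) > 1}

# Toy repertoire: two nucleotide variants encode CASSL, one encodes CASSF.
repertoire = [
    ("TGTGCCAGCAGTTTA", "CASSL"),
    ("TGTGCCAGCAGCTTG", "CASSL"),
    ("TGTGCCAGCAGTTTC", "CASSF"),
]
convergent = find_convergent(repertoire)
```

In this toy input only CASSL counts as convergent, because CASSF is encoded by a single nucleotide sequence.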
Main results:
- TCR convergence is different from publicity: The authors look at CDR3 sequence features of convergent TCRs in the large Emerson CMV cohort. Amino acid usage does not perfectly correlate with codon degeneracy: for example, arginine (which has 6 codons) is less common in convergent TCRs, whereas leucine and serine are elevated. It's argued that there's more to convergence than just recombination biases, which makes sense. (I wonder if the trends for charged amino acids could be explained by the enrichment of convergent TCRs in CD8 T cells, which tend to have more acidic CDR3 loops.) There's also a claim that the overlap between convergent and public TCRs is lower in tumors with a high mutational burden (TMB), but this part is weak: the definition of public TCRs is murky and hard to interpret, and the correlation between TMB and convergence-publicity overlap is modest. Two cohorts with low TMB have higher overlap and the other three have lower overlap, but there is no association across those three; if anything, the trend is in the other direction. It's also not clear why the overlap between convergent TCRs in the COVID-19 cohort and public TCRs defined by the pre-2019 Emerson cohort should be high. A confounder here is the potential association between convergence and clonal expansion, since expanded clonotypes can spawn apparently convergent TCRs due to sequencing errors. The paper "TCR Convergence in Individuals Treated With Immune Checkpoint Inhibition for Cancer" (Ref 5 here) gives evidence that sequencing errors may be inflating convergence in this specific dataset.
- Convergent TCRs are more likely to be antigen-specific: This is nicely shown on two datasets: the large dextramer dataset from 10x Genomics, and the COVID-19 datasets from Adaptive Biotechnologies. But given previous work on TCR convergence, for example the Pogorelyy ALICE paper and many others, this is also not especially surprising.
- Convergent T cells exhibit a CD8+ cytotoxic gene signature: This is based on a nice analysis of mouse and human single-cell datasets. One striking finding is that convergent TCRs are far more common in CD8+ T cells than in CD4+ T cells. It would be interesting to know how much of this could be explained by greater clonal expansion of CD8+ T cells, together with sequencing errors. A subtle point here is that some of the P values are probably inflated by the presence of expanded clonotypes: cells belonging to the same expanded clonotype will tend to have similar gene expression (and therefore similar cluster membership), and because they share the same TCR they will necessarily all be convergent or all be non-convergent. So it's probably not quite right to treat them as independent observations when assessing associations between gene expression clusters and convergence (or any other TCR-defined feature). You can see evidence for clonal expansion in Figure 3C, where TRAV genes are among the most enriched, suggesting that Cluster 04 may contain expanded clones.
- TCR convergence is associated with the clinical outcome of ICB treatment: The associations for the first analysis are described as significant in the text, and they are, but only barely (P values of 0.045 and 0.047, which can only be found by checking the figure).
- Introduction/Discussion: Overall, the authors could do a better job citing previous work on convergence, for example, papers from Venturi on convergent recombination and the work from Mora and Walczak (ALICE and other recombination modeling). They also present the use of convergence as an ICB biomarker as a novel finding, but Ref 5 introduces this concept and validates it in another cohort. Ref 5 also has a careful analysis of the link between sequencing errors and convergence, which could have been more carefully considered here.
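Two of the concerns raised above (sequencing-error variants of expanded clones, and non-independence of cells from the same clonotype) could be checked along the following lines. This is only a sketch under stated assumptions, not the method of the manuscript or of Ref 5: the function names, the one-mismatch rule, the 100-fold abundance ratio, and the input formats are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict

def hamming(a, b):
    """Mismatch count between equal-length strings; None if lengths differ."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))

def flag_possible_errors(nt_counts, max_mismatches=1, min_ratio=100):
    """Flag nucleotide variants that may be sequencing errors of an expanded clone.

    nt_counts: {cdr3_nt: read count} for variants encoding one CDR3 amino
    acid sequence. A variant is flagged if it lies within max_mismatches of
    another variant that is at least min_ratio times more abundant
    (both thresholds are illustrative assumptions).
    """
    flagged = set()
    for nt, count in nt_counts.items():
        for other, other_count in nt_counts.items():
            if other == nt or other_count < min_ratio * count:
                continue
            distance = hamming(nt, other)
            if distance is not None and distance <= max_mismatches:
                flagged.add(nt)
                break
    return flagged

def clonotype_level_table(cells):
    """Collapse cells to one observation per clonotype before testing.

    cells: iterable of (clonotype_id, in_cluster, is_convergent) tuples.
    Cluster membership is decided by strict majority vote within each
    clonotype; convergence is recorded once per clonotype and is assumed
    identical across its cells, since they share the same TCR.
    Returns a Counter keyed by (in_cluster, is_convergent).
    """
    votes = defaultdict(list)
    convergent = {}
    for clonotype_id, in_cluster, is_convergent in cells:
        votes[clonotype_id].append(in_cluster)
        convergent[clonotype_id] = is_convergent
    table = Counter()
    for clonotype_id, members in votes.items():
        majority_in_cluster = 2 * sum(members) > len(members)
        table[(majority_in_cluster, convergent[clonotype_id])] += 1
    return table

# Toy example 1: a rare variant one mismatch away from a dominant clone.
variants = {"TGTGCCAGCAGTTTA": 5000, "TGTGCCAGCAGTTTG": 3, "TGTGCCAGCAGCTTG": 40}
suspect = flag_possible_errors(variants)

# Toy example 2: one expanded clone (50 cells) plus three singletons.
cells = [("clone_A", True, True)] * 50 + [
    ("clone_B", True, False),
    ("clone_C", False, False),
    ("clone_D", False, True),
]
cell_level = Counter((in_cluster, conv) for _, in_cluster, conv in cells)
clone_level = clonotype_level_table(cells)
```

In the toy data, the 50 cells of the expanded clone dominate the cell-level contingency table but contribute a single observation after collapsing, which is the effect that could inflate the reported P values. An association test would then be run on the clonotype-level counts rather than the cell-level counts.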