Reviewer #2 (Public Review):
New comments are added after the authors' responses to my initial comments.
Summary:
Zhang et al. performed a proteogenomic analysis of lung adenocarcinoma (LUAD) in 169 female never-smokers from the Xuanwei area (XWLC) in China. These analyses reveal that XWLC is a distinct subtype of LUAD and that BaP is a major risk factor associated with EGFR G719X mutations found in the XWLC cohort. Four subtypes of XWLC were classified with unique features based on multi-omics data clustering.
Strengths:
The authors made great efforts in performing several large-scale proteogenomic analyses and characterizing molecular features of XWLCs. Datasets from this study will be a valuable resource to further explore the etiology and therapeutic strategies of air-pollution-associated lung cancers, particularly for XWLC.
Weaknesses:
[...]
(2) Importantly, although large datasets are provided, validation of key findings is minimal, and surprisingly there is no interrogation of XWLC drug response/efficacy based on the findings, which makes this manuscript descriptive and incomplete rather than conclusive. For example, testing the response of XWLC to afatinib combined with other drugs targeting activated kinases in EGFR G719X-mutated XWLC tumors would be one way to validate the datasets and the proposed therapeutic options.
Response: We appreciate your suggestion. In reference to testing the efficacy of XWLC response to afatinib combined with drugs targeting kinases, we have planned to establish PDX and organoid models to validate the effectiveness of our therapeutic approach. Due to the extended timeframe required, we intend to present these results in a subsequent study.
Comments: All conclusions in the manuscript are based on the authors' interpretations of large-scale multi-omics data, which should be properly validated by other approaches and methods. Without validation, these remain speculation, and conclusions without supporting evidence are not acceptable. This reviewer suggested an example validation experiment, and Reviewer #3 also pointed out several results that need to be validated. However, the authors decline to perform any of these validation experiments without providing reasonable justification.
(3) The authors found that MAD1 and TPRN are novel therapeutic targets in XWLC. Are these two genes more frequently mutated in one subtype than in the other three XWLC subtypes? How could these mutations be targeted in patients?
Response: Thank you for your question. We have investigated the TPRN and MAD1 mutations in our dataset, identifying five TPRN mutations and eight MAD1 mutations. Among the TPRN mutations, XWLC_0046 and XWLC_0017 belong to the MCII subtype, XWLC_0012 belongs to the MCI subtype, and the subtype of the other three samples is undetermined, resulting in mutation frequencies of 1/16, 2/24, 0/15, and 0/13, respectively. Similarly, for the MAD1 mutations, XWLC_0115, XWLC_0021, and XWLC_0047 belong to the MCII subtype, XWLC_0055 containing two mutations belongs to the MCI subtype, and the subtype of the other three samples is undetermined, resulting in mutation frequencies of 1/16, 3/24, 0/15, and 0/13 across subtypes, respectively. Fisher's test did not reveal significant differences between the subtypes. For targeting novel therapeutic targets such as MAD1 and TPRN, we propose a multi-step approach. Firstly, we advocate for conducting functional in vivo and in vitro experiments to verify their roles during cancer progression. Secondly, we suggest conducting small molecule drug screening based on the pharmacophore of these proteins, which may lead to the identification of potential therapeutic drugs. Lastly, we recommend testing the efficacy of these drugs to further validate their potential as effective treatments.
Comments: Please properly incorporate the above explanation into the main text.
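For readers who wish to check the subtype comparison described in the response, the following is a minimal sketch of an exact test on a 2 x 4 table (the Freeman-Halton extension of Fisher's exact test). It is not the authors' code. It assumes that the reported frequencies 1/16, 2/24, 0/15, and 0/13 are listed in the order MCI, MCII, MCIII, MCIV, that each count refers to mutated samples rather than mutation events, and that samples with undetermined subtype are excluded. The function names are hypothetical.

```python
from math import comb


def compositions(total, caps):
    """Yield every way to split `total` mutated samples across groups of sizes `caps`."""
    if len(caps) == 1:
        if total <= caps[0]:
            yield (total,)
        return
    for k in range(min(total, caps[0]) + 1):
        for rest in compositions(total - k, caps[1:]):
            yield (k,) + rest


def table_weight(mutated, group_sizes):
    """Unnormalised multivariate hypergeometric weight of one table."""
    weight = 1
    for k, n in zip(mutated, group_sizes):
        weight *= comb(n, k)
    return weight


def exact_p_value(mutated, group_sizes):
    """Sum the probabilities of all tables no more likely than the observed one."""
    total_mutated = sum(mutated)
    denominator = comb(sum(group_sizes), total_mutated)
    observed = table_weight(mutated, group_sizes)
    extreme = sum(
        w
        for w in (
            table_weight(ks, group_sizes)
            for ks in compositions(total_mutated, group_sizes)
        )
        if w <= observed
    )
    return extreme / denominator


# Assumed subtype order: MCI, MCII, MCIII, MCIV (not stated explicitly in the response).
subtype_sizes = (16, 24, 15, 13)
tprn_mutated = (1, 2, 0, 0)
mad1_mutated = (1, 3, 0, 0)

print("TPRN p-value:", round(exact_p_value(tprn_mutated, subtype_sizes), 3))
print("MAD1 p-value:", round(exact_p_value(mad1_mutated, subtype_sizes), 3))
```

Under these assumptions the TPRN comparison is far from significant, which is consistent with the authors' statement. With so few mutated samples per subtype, however, the test has very little power to detect a real difference.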
(4) In Figures 2a and b: while Figure 2a shows distinct genomic mutations among each LC cohort, Figure 2b shows similarity in affected oncogenic pathways (cell cycle, Hippo, NOTCH, PI3K, RTK-RAS, and WNT) between XWLC and TNLC/CNLC. Considering that different genomic mutations could converge into common pathways and biological processes, wouldn't these results indicate commonalities among XWLC, TNLC, and CNLC? How about other oncogenic pathways not shown in Figure 2b?
Response: Thank you for your question. Based on the data presented in Fig. 2a, which encompasses all genomic mutations, it appears that the mutation landscape of XWLC bears the closest resemblance to TSLC (Fig. 2a). However, when considering oncogenic pathways (Fig. 2b) and genes (Fig. 2c), there is a notable disparity between the two cohorts. These findings suggest that while XWLC and TSLC exhibit similarities in terms of genomic mutations, they possess distinct characteristics in terms of oncogenic pathways and genes.
Regarding the oncogenic signaling pathways, we referred to ten well-established pathways identified from TCGA cohorts. These members of oncogenic pathways are likely to serve as cancer drivers (functional contributors) or therapeutic targets, as highlighted by Sanchez-Vega et al. (2018).
Comments: It is unclear to this reviewer how the authors defined "distinct characteristics" in terms of oncogenic pathways and genes. Would 10-20% differences in "Fraction of samples affected" in Fig. 2b be sufficient to claim significance? How can the authors be sure whether mutations in genes involved in each oncogenic pathway are activating or inactivating (rather than benign and therefore without effect)?
[...]
(6) Supplementary Table 11 shows the number of mutations at the interface and the length of the interface for a given protein-protein interaction pair. As such, it does not show which mutation(s) in a given PPI interface are found in each LC cohort. For example, it does not indicate whether the MAD1 R558H and TPRN H550Q mutations are significantly enriched in each LC cohort.
Response: We appreciate your careful review. In Supplementary Table 11, we have provided significant onco_PPI data for each LC cohort, focusing on enriched mutations at the interface of two proteins. Our emphasis lies on onco_PPI rather than individual mutations, as any mutation occurring at the interface could potentially influence the function of the protein complex. Thus, our Supplementary Table 11 exclusively displays the onco_PPI rather than mutations. MAD1 R558H and TPRN H550Q were identified through onco_PPI analysis, and subsequent extensive literature research led us to focus specifically on these mutations.
Comments: Are the authors referring to Table S9 (Onco_PPIs identified in four cohorts) instead of Supplementary Table 11? There is no Table 11 among the submitted files. In Table S9, Column N (length of protein product of gene1) does not make sense: MYO1C (8152), TP53 (3924), EGFR (12961). These cannot be the numbers of amino acid residues in each protein. What, then, do these numbers mean?
(7) Figures 7c and d show simulation data, not data from an actual binding assay. The authors should perform a biochemical binding assay with the proteins, or otherwise show that the mutation significantly alters the interaction, to support the conclusion.
Response: We appreciate your suggestion. The relevant experiments are currently in progress, and we anticipate presenting the corresponding data in a subsequent study.
Comments: The suggested experiment is intended to support the simulated data. Again, without supporting experimental results, the authors cannot draw a conclusion based on simulated data alone. Where else could the supporting experimental results be presented?