2 Matching Annotations
  1. Jul 2018
    1. On 2017 Jan 23, Andy Collings commented:

      (Original comment found at: https://elifesciences.org/content/6/e17044#disqus_thread)

      Response to “Replication Study: Discovery and preclinical validation of drug indications using compendia of public gene expression data”

      Atul J Butte, Marina Sirota, Joel T Dudley

      We represent three of the key authors of the original work.

      In October 2013, we were pleased to see that our original 2011 publication (Sirota et al., 2011) had been chosen as one of the top 50 influential cancer studies selected for reproducibility. Our initial impression, probably like that of most investigators reading this letter, was that such recognition would be a mixed blessing. Most of the work for this paper was conducted in 2009, four years before we were approached. We can see now that this reproducibility effort is one of the first 10 to be completed, and one of the first 5 to be published, more than three years later. The reproducibility team should be commended for their diligence in repeating the experimental details as closely as possible.

      The goal of the original study was to evaluate a prediction from a novel, systematic computational technique that used open-access gene-expression data to identify potential off-indication therapeutic effects of several hundred FDA-approved drugs. We chose to evaluate cimetidine based on the biological novelty of its predicted connection to lung cancer and the availability of local collaborators in this disease area.

      The key experiment replicated here involved 18 SCID mice treated daily, by intraperitoneal injection, with one of three doses of cimetidine (ranging from 25 to 100 mg/kg) after implantation of A549 human adenocarcinoma cells, along with 6 mice treated with doxorubicin as a positive control and 6 mice treated with vehicle only as a negative control. The reproducibility team used many more mice in their experiment, but tested only the highest dose of cimetidine.

      First, it is important to note clearly that we are truly impressed with how closely Figure 1 in the reproducibility paper matches Figure 4c in our original paper; this is the key finding, that cimetidine has a biological effect intermediate between PBS/saline (the negative control) and doxorubicin (the positive control). We commend the authors for even using the same colors as we did, which better highlights the match between their figure and ours.

      While several valid analytic methods were applied to the new tumor-volume data, the analysis most similar to the original was the t-test we conducted on the day 11 measurements, comparing 100 mg/kg cimetidine with the vehicle control. The new measurements were evaluated with a Welch t-test, yielding t(53) = 2.16 and p = 0.035. We are extremely pleased to see this raw p-value emerge from their experiment.
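
      For readers who want to check the arithmetic, the reported statistic and degrees of freedom recover the reported raw p-value. A minimal sketch in Python, using only the numbers quoted above:

      ```python
      from scipy.stats import t

      # Welch statistic and degrees of freedom reported by the replication team
      t_stat, df = 2.16, 53

      # Two-sided p-value from the t distribution's survival function
      p_two_sided = 2 * t.sf(t_stat, df)
      print(round(p_two_sided, 3))  # ~0.035, the raw p-value quoted above
      ```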

      However, the reproducibility team then applied a Bonferroni adjustment, resulting in a corrected p = 0.105. While this adjustment was decided a priori and documented (Kandela et al., 2015), we fundamentally disagree with their approach.
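
      The adjustment itself is simple arithmetic; a one-line sketch of the Bonferroni rule as the team applied it:

      ```python
      raw_p = 0.035  # Welch t-test, cimetidine vs. vehicle at day 11
      m = 3          # comparisons counted: cimetidine, positive control, negative control
      print(min(raw_p * m, 1.0))  # 0.105, the corrected value reported
      ```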

      The reproducibility team took on this validation effort by starting with our finding that cimetidine demonstrated some efficacy in the pre-clinical experiments. However, our study did not start with that prediction. We started our experiments with open data and a novel computational effort. Readers of our original paper (Sirota et al., 2011) will see that we started much earlier in the process, with publicly available gene expression data on drugs and diseases, and computationally predicted that certain drugs could be useful to treat certain conditions. We then chose cimetidine and lung adenocarcinoma from among the list of significant drug-disease pairs for validation. This pairing was statistically significant in our computational analysis, which included a formal evaluation of multiple-hypothesis testing using randomly shuffled data and the calculation of q-values and false discovery rates, commonly used methods for controlling for the testing of multiple hypotheses. Aside from the statistical significance, local expertise in lung cancer and the availability of reagents, A549 cells, and mouse models in our core facilities guided the selection. We then chose an additional pairing that we explicitly predicted (by the computational methodology) would fail: we again used cimetidine, and found that we had ACHN cells that could represent a model of renal cancer. Scientists will recognize this as a negative control.
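
      To illustrate what that upstream correction looks like, here is a hypothetical sketch of permutation-based false-discovery-rate estimation; the scores, counts, and threshold below are invented for illustration and are not values from our screen:

      ```python
      import numpy as np

      rng = np.random.default_rng(0)

      # Invented stand-ins: one score per candidate drug-disease pair, plus the
      # same scoring applied to randomly shuffled expression data
      observed = rng.normal(0.5, 1.0, 500)
      shuffled = rng.normal(0.0, 1.0, (100, 500))

      threshold = 2.0
      n_called = (observed >= threshold).sum()                 # pairs called significant
      mean_false = (shuffled >= threshold).sum(axis=1).mean()  # mean false calls per shuffle
      fdr = mean_false / max(n_called, 1)                      # estimated false discovery rate
      print(f"pairs called: {n_called}, estimated FDR: {fdr:.2f}")
      ```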

      At no point did we feel that the comparison of cimetidine against A549 cells had anything to do with the effect of cimetidine in ACHN cells; these were independently run experiments. The ACHN experiment was designed to probe the specificity of the computational process upstream of all of this; it had nothing to do with our belief in cimetidine in A549 cells. Thus, we would not agree with the replication team’s characterization that these were all multiple hypotheses being validated equally and thus merited a common adjustment of p-values. As described above, we corrected for multiple-hypothesis testing earlier in our process, at the computational stage. We never expected the cimetidine/ACHN experiment to succeed when we ran it. Similarly, our test of doxorubicin in A549 cells was performed as a positive control; we fully expected that experiment to succeed.

      In email discussion, we learned that the replication team felt these three hypotheses were tested equally, and thus adjusted the p-values by multiplying them by 3. We are going to have to respectfully “agree to disagree” here.

      We note some interesting consequences of their adjustments: for example, the reproducibility team also did not find doxorubicin to have a statistically significant effect compared to vehicle-treated mice. Here the Welch t-test yielded p = 0.0325, but with their Bonferroni correction this would no longer be deemed a significant association. Doxorubicin has been a known drug against A549 cells for nearly 30 years (Nishimura et al., 1989), and our use of it was only as a positive-control agent.

      Figure 3 was also very encouraging: we see a significant effect in the original study, in the replication, and in the meta-analysis of the two together.

      In the end, we want to applaud replication efforts like this. We do believe it is important for the public to have trust in scientists and belief in the veracity of our published findings. However, we recommend that future replication teams choose papers in a more impactful manner. While it is an honor for our paper to be selected, we were never going to run a clinical trial of cimetidine in lung adenocarcinoma, and we cannot see any such protocol being listed on clinicaltrials.gov. Our publication was aimed more at demonstrating the value of open data through the validation of a specific computational prediction. We suggest that future replication studies of pre-clinical findings be targeted at those most likely to actually head into clinical trials.

      References

      Sirota M, Dudley JT, Kim J, Chiang AP, Morgan AA, Sweet-Cordero A, Sage J, Butte AJ. Discovery and preclinical validation of drug indications using compendia of public gene expression data. Sci Transl Med. 2011 Aug 17;3(96):96ra77. doi: 10.1126/scitranslmed.3001318.

      Kandela I, Zervantonakis I; Reproducibility Project: Cancer Biology. Registered report: Discovery and preclinical validation of drug indications using compendia of public gene expression data. eLife. 2015 May 5;4:e06847. doi: 10.7554/eLife.06847.

      Nishimura M, Nakada H, Kawamura I, Mizota T, Shimomura K, Nakahara K, Goto T, Yamaguchi I, Okuhara M. A new antitumor antibiotic, FR900840. III. Antitumor activity against experimental tumors. J Antibiot (Tokyo). 1989 Apr;42(4):553-7.


      This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.

    2. On 2017 Jan 20, Robert Tibshirani commented:

      The Replication Study by Kandela et al. of the Sirota et al. paper “Discovery and Preclinical Validation of Drug Indications Using Compendia of Public Gene Expression Data” reports a non-significant p-value of 0.105 for the test of the main finding for cimetidine in lung adenocarcinoma. They obtained this from a Bonferroni adjustment of the raw p-value of 0.035, multiplying it by three because the authors had also tested a negative and a positive control.

      This seems to me to be an inappropriate use of a multiple comparison adjustment. Such adjustments are designed to protect the analyst against false discoveries. However, if Sirota et al. had found the negative control to be significant, they would not have reported it as a "discovery"; rather, it would have pointed to a problem with the experiment. Similarly, the significant result in the positive control was not considered a "discovery" but was a check of the experiment's quality.

      Now, it is true that Kandela et al. specified in their protocol that they would use a (conservative) Bonferroni adjustment in their analysis, and used this fact to choose a sample size of 28, which yielded an estimated power of 80%. Had they chosen the unadjusted test, the estimated power for n = 28 would have been somewhat higher, about 90%. I think that the unadjusted test is appropriate here.
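
      This power comparison can be reproduced approximately with standard two-sample t-test power calculations. A minimal sketch, assuming a two-sided test, 28 animals per group, and the effect size implied by the protocol's 80% power target at the Bonferroni-adjusted alpha (these assumptions are mine, not details taken from the protocol):

      ```python
      from statsmodels.stats.power import TTestIndPower

      analysis = TTestIndPower()

      # Effect size implied by 80% power at the Bonferroni-adjusted alpha
      # (0.05 / 3) with n = 28 per group (an assumption for illustration)
      d = analysis.solve_power(nobs1=28, alpha=0.05 / 3, power=0.80, ratio=1.0)

      # Power of the same design under the unadjusted test at alpha = 0.05
      power_unadj = analysis.power(effect_size=d, nobs1=28, alpha=0.05, ratio=1.0)
      print(f"implied effect size d = {d:.2f}; unadjusted power = {power_unadj:.2f}")
      ```

      With these assumptions the unadjusted power comes out near 0.9, consistent with the figure quoted above.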


      This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.
