4 Matching Annotations
  1. Jul 2018
    1. On 2016 Jan 14, Arturo Casadevall commented:

      The central criticism is that we have compared variables for which there is no causal relationship. We recognize the difficulties involved in assuming causality and the dangers of spurious correlations when plotting unrelated variables. Furthermore, we are fully aware that correlation is not causation. However, the criticism made by LeVine and Weinstein does not take into account a large body of published scholarly work showing that spending of public funds translates into medical goods such as new therapeutics. To make this point, we note the findings of several studies. In 2000, a United States Senate Report found that of the 21 most important drugs introduced between 1965 and 1992, 15 (71%) ‘were developed using knowledge and techniques from federally funded research’ (http://www.faseb.org/portals/2/pdfs/opa/2008/nih_research_benefits.pdf). A recent study of 26 transformative drugs or drug classes found that many were discovered with governmental support [1]. Numerous other studies have reinforced this point [2-4]. Blume-Kohout estimated that a 10% increase in targeted funding for specific diseases produced a 4.5% increase in the number of drugs reaching clinical trials after an average lag of 12 years [5]. In our own investigations, we have traced the ancestry of most of the drugs licensed in the past four decades to publicly funded research (unpublished data). The literature in this field overwhelmingly supports the notion that public spending on biomedical research translates into public goods. The debate is not about whether this happens but rather about the magnitude of the effect. The notion that public funding of biomedical research generates basic knowledge that is subsequently used in drug development is accepted by most authorities. Hence, the use of the NIH budget as a proxy for public spending in biomedical research is appropriate.

      We are aware that establishing causal relationships among non-experimental variables can be a daunting task. However, we note that the relationship between public spending and medical goods does meet some of the essential criteria needed to establish causality. First, the relationship between these variables meets the requirement of temporal causality since, for many drugs, publicly funded basic research precedes drug development. Second, the relationship satisfies mechanistic causality, since knowledge from basic research is used in designing drugs. There are numerous examples of such mechanisms, including receptors discovered with public funds that are subsequently exploited in drug development when industry generates an agonist or inhibitor. We acknowledge that we do not know if the relationship between public spending and drug development is linear, and the precise mathematical formulation for how public spending translates into medical goods is unknown. In the absence of evidence for a more complex relationship, a linear relationship is a reasonable first approximation, and we note that other authorities have also assumed linear relationships in analyzing inputs and outcomes in the pharmaceutical industry. For example, Scannell et al. [6] used a similar analysis to make the point that ‘The number of new drugs approved by the US Food and Drug Administration (FDA) per billion US dollars (inflation-adjusted) spent on research and development (R&D) has halved roughly every 9 years’.
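
      To make the quoted figure concrete, the arithmetic below (an illustration only; the nine-year halving time is taken from the Scannell et al. quotation and everything else is derived from it) converts it into an implied annual rate of decline.

      ```python
      # Illustrative arithmetic only: converts the "halved roughly every 9 years"
      # figure quoted from Scannell et al. [6] into an implied annual decline.
      halving_time_years = 9.0
      annual_decline = 1 - 2 ** (-1 / halving_time_years)
      print(f"implied decline in NMEs per inflation-adjusted $1B of R&D: "
            f"{annual_decline:.1%} per year")          # about 7.4% per year

      # Over six decades this rule of thumb compounds to roughly a hundredfold drop:
      print(f"implied 60-year decline factor: {2 ** (60 / halving_time_years):.0f}x")
      ```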

      The authors claim to have done a causality analysis of the data generated in our paper, concluding that ‘We do not find evidence that NIH budget ⇒ NME (p=0.475), and thus it may not be a good indicator of biomedical research efficiency.’ However, this oversimplifies a very complex process of how public spending affects NME development; we do not agree that this simple analysis can be used to deny causality. Although the limited information provided in their comment does not permit a detailed rebuttal, we note that a failure to reject the Granger causality null hypothesis does not necessarily indicate the absence of causality. Furthermore, Granger causality refers to the ability of one variable to improve predictions of the future values of a second variable, which is distinct from the philosophical definition of causality. Whether or not NIH budget history adds predictive ability in determining the number of NMEs approved at some point in the future cannot negate the fact that basic biomedical research funding unequivocally influences the creation of future drugs, as well as many other outcomes. Therefore, we stand by our use of NIH funding and NMEs as indicators of biomedical research inputs and outcomes.
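
      For clarity, the standard bivariate formulation of the test under discussion can be written as follows; this is a textbook statement in our own notation, not the commenters’ exact specification.

      ```latex
      % Textbook bivariate Granger-causality setup (illustrative notation only):
      Y_t = c + \sum_{i=1}^{p} a_i\, Y_{t-i} + \sum_{i=1}^{p} b_i\, X_{t-i} + \varepsilon_t ,
      \qquad H_0 : b_1 = b_2 = \cdots = b_p = 0 .
      % "X Granger-causes Y" means only that H_0 is rejected, i.e. that lagged values
      % of X improve the forecast of Y beyond what Y's own history provides.
      ```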

      The authors suggest that another study by Rzhetsky et al. [7] contradicts the findings of our paper and provides a better method of studying biomedical research efficiency. The work by Rzhetsky et al., while very interesting, addresses a fundamentally different question relating to how scientists can most efficiently choose research topics to explore a knowledge network [7]. The allocation of scientists to research topics is undoubtedly a possible contributor to overall research efficiency, but the approach taken in that work is very different from our broader analysis of the biomedical research enterprise as a whole. The work in [7] has a narrow scope and does not attempt to study the impact of research investments in producing societal outcomes. The central conclusion of our paper is that biomedical research inputs and outputs are increasing much faster than outcomes, as measured by NMEs and LE.

      We do not ‘conjecture that a lack of relevance or rigor in biomedical research’ is solely responsible for this phenomenon, as LeVine and Weinstein assert. Instead, our paper discusses a number of possible explanations, many of which have been previously identified in the literature [6-12], including several that agree with the conclusions of Rzhetsky et al. [7]. However, the recent epidemic of retracted papers and growing concerns about the reproducibility of biomedical studies, expressed in part by pharmaceutical companies dedicated to the discovery of NMEs [13, 14], are indisputable facts. If a substantial portion of basic science findings is unreliable, this is likely to contribute to reduced productivity of the research enterprise. We agree with the suggestion, made by others [6], that research difficulty increases as a field matures; this does not contradict our analysis and is mentioned in our paper’s discussion. Biomedical research efficiency is complex, and it is likely that the decline in scientific outputs has numerous causes. It is appropriate for scientists to consider any factors that may be contributing to this trend, and the comments from Dr. Schuck-Paim in this regard (see the other posted comments) are therefore welcome.

      In summary, we do not find the arguments of LeVine and Weinstein to be compelling. We note that other investigators have come to conclusions similar to ours [6, 15]. The productivity crisis in new drug development has been intensively discussed for at least a decade [6, 15, 16]. We believe that addressing inefficiencies in biomedical research is essential to maintain public confidence in science and, by extension, public funding for basic research.

      Arturo Casadevall and Anthony Bowen

      [1] Health Aff (Millwood) 2015, 34:286-293.

      [2] PNAS 1996, 93:12725-12730.

      [3] Am J Ther 2002, 9:543-555.

      [4] Drug Discov Today 2015, 20:1182-1187.

      [5] J Policy Anal Manage 2012, 31:641-660.

      [6] Nat Rev Drug Discov 2012, 11:191-200.

      [7] PNAS 2015, 112:14569-14574.

      [8] Res Policy 2014, 43(1):21–31.

      [9] Nature 2015, 521(7552):270–271.

      [10] Br J Cancer 2014, 111(6):1021–1046.

      [11] Nature 2015, 521(7552):274–276.

      [12] J Psychosom Res 2015, 78(1):7–11.

      [13] Nature 2012, 483(7391):531-533.

      [14] Nat Rev Drug Discov 2011, 10(9):712.

      [15] Nat Rev Drug Discov 2009, 8:959-968.

      [16] Innovation Policy and the Economy 2006, 7:1-32.


      This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.

    2. On 2016 Jan 08, Michael LeVine commented:

      In this recent article, Bowen and Casadevall attempt to quantify biomedical research efficiency as the ratio of outcomes to input. The outcomes were chosen to be approved new molecular entities (NMEs) and US life expectancy (LE); the chosen input was the NIH budget. While the resulting analysis claims that efficiency has decreased in the last decade, we argue that (i) the analysis performed is insufficient to make that claim, and (ii) the findings do not support the conjecture that a lack of relevance or rigor in biomedical research is causing stagnation in medicine and public health.

      Bowen and Casadevall suggest that because research projects take time to complete, it is possible that “the exponential growth in research investment and scientific knowledge over the previous five decades has simply not yet grown fruit and that a deluge of medical cures are right around the corner”. They investigate time-lagged efficiency for NMEs, but this analysis is only sufficient if there is a linear causal relationship between the two variables that is unaffected by any external variables that have not been included in the analysis. Without any evidence of such a relationship, it is unwise to interpret a trend in a ratio between two unassociated measurements. Just as two unrelated measurements can display a spurious correlation, the ratio between those measurements may display a spurious trend.
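
      A minimal simulation illustrates the concern; the code below is purely illustrative, uses arbitrary parameters, and is not part of our reanalysis. The ratio of two independent drifting series will typically show a clear trend even though neither series affects the other.

      ```python
      # Purely illustrative: the ratio of two independent random walks with drift
      # (loosely mimicking an "input" and an "outcome" series) generally shows a
      # systematic trend despite the absence of any causal link between them.
      import numpy as np

      rng = np.random.default_rng(0)
      n_years = 48                                              # roughly 1965-2012
      inputs = 100 + np.cumsum(rng.normal(1.0, 2.0, n_years))   # unrelated series 1
      outcomes = 100 + np.cumsum(rng.normal(0.5, 2.0, n_years)) # unrelated series 2

      efficiency = outcomes / inputs                   # "outcome per unit input"
      slope = np.polyfit(np.arange(n_years), efficiency, 1)[0]
      print(f"fitted trend in the ratio: {slope:+.4f} per year")
      ```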

      We reanalyzed the data used in this paper to find evidence of causal relationships between the inputs and outcomes. To do this, we tested for Granger causality (1), which identifies potentially causal relationships by determining whether the time series of one variable is able to improve the forecasting of another. We analyzed the non-stationary time series from 1965-2012 using the Toda and Yamamoto method (2), which utilizes vector autoregression. We will refer to a variable X improving the forecasting of a variable Y as X ⇒ Y.

      We do not find evidence that NIH budget ⇒ NME (p=0.475), and thus it may not be a good indicator of biomedical research efficiency. However, we do find evidence that NIH budget ⇒ LE (p<10⁻⁸) and NIH budget ⇒ publications (p < machine precision). Notably, however, both VAR models utilize the maximum possible time lags (15, as selected using the Akaike information criterion (3), coincidentally the same number as used in this paper), and do not pass the Breusch-Godfrey test (4) for serially correlated residuals. As this suggests that more time lags are required to build appropriately rigorous models, it seems unwise to over-interpret the potential Granger causal relationships or make any comparisons between the Granger causality during different periods of time, until significantly more time points are available. Even with additional data, the serial correlation in the residuals might not be alleviated without the use of more complex models including non-linear terms or external variables. All three of these possible limitations also affect the analysis in this paper, but no statistical tests were performed there to assess the robustness of results.
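
      For readers who wish to reproduce this type of analysis, the sketch below shows one way to set up a Toda-Yamamoto-style test with the Python statsmodels library. It is a minimal illustration rather than our actual code: the file and column names are placeholders, the maximum order of integration is assumed rather than tested, the statsmodels causality helper tests all fitted lags rather than only the first p of them, and its residual diagnostic is a Portmanteau whiteness test rather than Breusch-Godfrey.

      ```python
      # Minimal, illustrative sketch of a Toda-Yamamoto-style Granger test using
      # statsmodels. File name, column names, and d_max are placeholder assumptions;
      # this is not the code used for the reanalysis described above.
      import pandas as pd
      from statsmodels.tsa.api import VAR

      data = pd.read_csv("inputs_outcomes.csv", index_col="year")   # hypothetical file
      series = data.loc[1965:2012, ["nih_budget", "nme"]]           # hypothetical columns

      model = VAR(series)
      p = model.select_order(maxlags=15).aic      # lag order selected by AIC
      d_max = 1                                   # assumed maximum order of integration
      res = model.fit(p + d_max)                  # Toda-Yamamoto: fit p + d_max lags

      # Wald test of whether lagged NIH budget improves forecasts of NMEs.
      print(res.test_causality("nme", ["nih_budget"], kind="wald").summary())

      # Check the VAR residuals for remaining autocorrelation (Portmanteau test).
      print(res.test_whiteness(nlags=p + d_max + 5, adjusted=True).summary())
      ```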

      We conclude that this published study of biomedical research efficiency is methodologically insufficient because models of greater complexity are required. From our reanalysis, we are not able to support the hypothesis that biomedical research efficiency is decreasing. Instead, we can only conclude that from 1965-2012, the NIH budget may have had a causal effect on LE and publications, but that more time points are required to improve the models.

      Another recent work (5), which aimed to study the scientific process and its efficiency, also suggested that the efficiency of biomedical chemistry research (defined in terms of the number of experiments that would need to be performed to discover a given fraction of all knowledge) has decreased over time. However, the analysis in (5) also suggested that even the optimal research strategy would eventually display a decrease in efficiency, due to the intrinsic increase in the difficulty of discovery as a field matures. While there are additional limitations and assumptions involved in this analysis, (5) provides an example of the level of complexity and quantitative rigor required to study research efficiency, and implies an alternative explanation for the potential reduction in biomedical research efficiency. Considering the findings in (5), we deem the hypothesis proposed in this paper, which suggests that a lack of relevance or rigor in biomedical research is causing stagnation in medicine and public health, to be unfounded.

      We write this comment in part because unfounded defamatory claims directed at the scientific community are dangerous in that they may negatively affect the future of scientific funding. Such claims should not be made lightly, and the principle of parsimony should be invoked when less defamatory alternative hypotheses are available.

      Michael V. LeVine and Harel Weinstein

      (1) Granger CWJ (1969) Investigating Causal Relations by Econometric Models and Cross-spectral Methods. Econometrica 37(3):424–438.

      (2) Toda HY, Yamamoto T (1995) Statistical inference in vector autoregressions with possibly integrated processes. J Econom 66:225–250.

      (3) Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control 19(6):716–723.

      (4) Breusch TS (1978) Testing for autocorrelation in dynamic linear models. Aust Econ Pap 17(31):334–355.

      (5) Rzhetsky A, Foster JG, Foster IT, Evans JA (2015) Choosing experiments to accelerate collective discovery. Proc Natl Acad Sci USA 112:14569–14574.


      This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.

    3. On 2016 Jan 14, Arturo Casadevall commented:

      We appreciate the comment by Dr. Schuck-Paim and we fully agree that increased transparency and reporting of data during the process of therapeutic development can only benefit the scientific enterprise and enable a more efficient use of limited resources. Some of the other issues mentioned, including the relevance of animal models, underpowered study designs, and errors during data analysis and reporting, have all been implicated in the literature as referenced by Dr. Schuck-Paim and cited in our paper. We agree that each of these issues merits attention and expect that new tools will need to be developed to address some of the problems. One example would be the development of drug-screening chips containing human cells as an alternative to some animal models, which may poorly predict a drug’s toxicity in humans (http://www.ncats.nih.gov/tissuechip).

      Anthony Bowen

      Arturo Casadevall


      This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.

    4. On 2015 Nov 05, Cynthia Schuck-Paim commented:

      In exploring progress in biomedical research, the authors show that human life expectancy and the number of new molecular entities (NME) approved by the FDA have remained relatively constant over the last decades, despite increasing financial input, research efforts and publication numbers.

      To explain the slowing of therapeutic innovation they consider several negative pressures acting on the field, including prior resolution of simpler research problems, increasing regulation, overreliance on reductionist approaches (including use of animal models), and the poor quality of published research. The high prevalence of irreproducible results, obscure methods, poorly designed research and publication biases are also mentioned.

      Many of these issues would greatly benefit from initiatives that promote transparency at the various stages of the therapeutic development pipeline. It has been widely acknowledged that poor reporting prevents the accurate assessment of drug and intervention efficacy. Indeed, pre-clinical research and in vivo methods have been shown to be particularly prone to biases and selective reporting of outcomes, leading to bad decision-making, wasted resources, unnecessary replication of efforts and missed opportunities for the development of effective drugs (1). One proposal to address this issue is to extend good disclosure practice to the pre-clinical phase by conditioning publication on registration of the pre-clinical trial prior to commencement of the study (2). Indeed, the exact same reasons that compelled prospective registration and deposition of clinical trial results in public databases apply to preclinical studies.

      Still, no matter how transparent, well-designed, well-analyzed and well-reported the research, results generated from inappropriate models will not be successfully translated into valid disease contexts. Currently, most pre-clinical studies are based on the use of animal models, despite the increasing number of articles showing that they are an expensive and ineffective option for exploring pathophysiological mechanisms, evaluating therapeutics, and deciding whether drug candidates should be carried forward into the clinical phase (3-5).

      Failure rates in the clinical phase are around 95% (6), mainly due to the limited power of animal studies to predict NME efficacy, safety and toxicity in humans. These predictive odds vary depending on the understanding and complexity of disease biology: while for therapeutics targeting infectious diseases success rates are higher, for diseases involving complex mechanisms, such as cancer, they can be as low as 2.3% (7). Such low predictability drains the entire system by funneling limited resources into outputs that often fail.

      In addition, false negatives at the pre-clinical stage eliminate a large fraction of NMEs that might otherwise have succeeded. Let us not forget aspirin, a blockbuster drug that would not make it past the preclinical phase if tested today, given its unacceptably high toxicity in animal tests. Animal-based pre-clinical phases are certainly pivotal in explaining the small number of NMEs identified in the last decades. Implementation of methods that are more faithful to human biology is crucial for the much-needed progress in ameliorating human disease and suffering.

      References

      (1) Macleod MR et al. (2015) Risk of bias in reports of in vivo research: a focus for improvement. PLoS Biol 13:e1002273.

      (2) Kimmelman J, Anderson JA (2012) Should preclinical studies be registered? Nat Biotechnol 30:488–489.

      (3) Hartung T (2013) Food for thought: look back in anger – what clinical studies tell us about preclinical work. Altex 30:275–291.

      (4) Sutherland BA et al. (2012) Neuroprotection for ischaemic stroke: translation from the bench to the bedside. Int J Stroke 7:407–418.

      (5) Seok J et al. (2013) Genomic responses in mouse models poorly mimic human inflammatory diseases. PNAS 110:3507–3512.

      (6) Arrowsmith J (2012) A decade of change. Nat Rev Drug Discov 11:17–18.

      (7) Hay M et al. (2014) Clinical development success rates for investigational drugs. Nat Biotechnol 32(1):40–51.


      This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.
