62 Matching Annotations
  1. Feb 2024
    1. Although some of the EverForecast species models provide output on a daily time scale, water managers and scientists in the Everglades typically evaluate hydrologic conditions and make recommendations on a weekly to monthly time scale. To target managers’ needs, we summarized EverForecast outputs on a biweekly (14 day) time step.

      Time scale of management needs for Everglades forecasting

  2. Oct 2023
    1. https://web.archive.org/web/20231019053547/https://www.careful.industries/a-thousand-cassandras

      "Despite being written 18 months ago, it lays out many of the patterns and behaviours that have led to industry capture of 'AI Safety'", co-author Rachel Coldicutt (with Anna Williams and Mallory Knodel, for Open Society Foundations).

      For Open Society Foundations by 'careful industries' which is a research/consultancy, founded 2019, all UK based. Subscribed 2 authors on M, and blog.

      A Thousand Cassandras in Zotero.

  3. Aug 2023
    1. forecasts of the magnitude and longevity of the event were very good; although forecasted high-temperature records fell 1–3 °C short of the observed highs in many cases.
      • for: meteorology, forecasting, ensemble forecasting, Pacific Northwest heatwave
      • paraphrase
        • forecasts of the magnitude and longevity of the event were very good;
        • although forecasted high-temperature records fell 1–3 °C short of the observed highs in many cases.
    2. Meteorologists are typically reluctant to make extreme weather forecasts at forecast horizons of around a week for fear of “crying wolf” and the associated reduction in end-user trust. In this case, however, the ensemble forecast provided sufficient certainty that meteorologists were able to warn of “extreme” heat at this relatively long-lead time—a testament to ensemble forecast technology.
      • for: meteorology, forecasting, ensemble forecasting
      • paraphrase
        • Meteorologists are typically reluctant to make extreme weather forecasts at forecast horizons of around a week
          • for fear of “crying wolf” and
          • the associated reduction in end-user trust.
        • In this case, however, the ensemble forecast provided sufficient certainty that meteorologists were able to warn of “extreme” heat at this relatively long-lead time
        • a testament to ensemble forecast technology.
  4. Dec 2021
  5. Sep 2021
    1. Bracher, J., Wolffram, D., Deuschel, J., Görgen, K., Ketterer, J. L., Ullrich, A., Abbott, S., Barbarossa, M. V., Bertsimas, D., Bhatia, S., Bodych, M., Bosse, N. I., Burgard, J. P., Castro, L., Fairchild, G., Fuhrmann, J., Funk, S., Gogolewski, K., Gu, Q., … Xu, F. T. (2021). A pre-registered short-term forecasting study of COVID-19 in Germany and Poland during the second wave. Nature Communications, 12(1), 5173. https://doi.org/10.1038/s41467-021-25207-0

  6. Jan 2021
  7. Oct 2020
    1. Affective forecasting is the process by which we attempt to predict how we will feel in the future. One of the ways we fail at this task is called the end of history illusion, which suggests that we’re well aware of how much we’ve changed in the past ten years, but we imagine that that’s it—we’re done changing. When asked how much we think we’ll change in the next ten years, we assume we’re done.
  8. Sep 2020
  9. Aug 2020
  10. Jun 2020
  11. May 2020
  12. Nov 2018
    1. This reflects a fundamental property of EDM in that forecast performance depends solely on the information content of the data rather than on how well assumed equations match reality. To clarify the concept of nonuniqueness, consider the canonical Lorenz attractor (SI Appendix, Fig. S1A). The behavior of this system is governed by three differential equations (SI Appendix, Eq. S1). However, the axes can be rotated to produce three new coordinates, x′, y′, and z′, and the equations rewritten in terms of these new coordinates, allowing the system to be described using either representation (x, y, and z or x′, y′, and z′) as well as mixed combinations (e.g., x, y, and z′). Thus, with an infinite number of ways to rotate the system, there are an unlimited number of “true variables” and “true models.” In the case of sockeye salmon, the similar performance of different models (SI Appendix, Table S4) does not mean that one or the other model is incorrect; instead, it reflects the fact that the environmental variables are indicators of the same general mechanism, and so different variable combinations can be equally informative for forecasting recruitment. Again, we emphasize that including a variable does not imply a direct causal link—variables in an EDM model improve forecasts because they are informative; it does not mean that the included variables are proximate causes. Importantly, the converse does not hold either: a variable could be causal and yet not appear in the multivariate EDM; this might occur when multiple stochastic drivers affect recruitment in an interdependent way, necessitating that a model include measurements of all of the drivers to account for their combined effect. For example, although none of the tested variables seem to improve forecasts for the Birkenhead stock (SI Appendix, Table S4), this does not mean that these sockeye salmon are insensitive to SST, river discharge, and the PDO. Rather, it suggests that the effect of these variables may be modulated by other factors not considered here.

      This distinction between direct causality and information content is a really useful perspective even beyond EDM.

  13. Jul 2018
  14. Dec 2017
    1. Feedback mechanisms provide stability such that ecosystems appear stable during some time frames but can abruptly shift to express new structures in others (9)

      We need to understand how frequently these kinds of change happen in order to understand the potential for forecasting and the best kinds of models for approaching it.

  15. Nov 2017
    1. One of the primary uses of a model like this one is to improve the conversation between stakeholders and managers. The model can be valuable in helping managers and citizens arrive at realistic goals and to realize that there will be inherent risks associated with meeting those goals. For example, our analysis shows that reducing the probability of transmission by one half in five years using vaccination is not likely when we include uncertainty in the ability of managers to treat a targeted number of seronegative females. Forecasts suggested that there was virtually no chance of meeting that goal (Table 12). Similarly there was a 7% chance of reducing adult female seroprevalence below 40% using vaccination. We can nonetheless use this work to articulate what level of brucellosis suppression is feasible given current technology. For example, managers and stakeholders might agree that it is enough to be moving in the right direction with efforts to reduce risk of infection from brucellosis. In this case, a reasonable goal might be “Reduce the probability of exposure by 10% relative to the current median value.” The odds of meeting that goal using vaccination increased to 26%. With this less ambitious goal, vaccination increases the probability that the goal would be met relative to no action by a factor of only 1.4. This illustrates a fundamental trade-off in making management choices in the face of uncertainty: less ambitious goals are more likely to be met, but they offer smaller improvements in the probability of obtaining the desired outcome relative to no action.

      Great description of the value of forecasting models for improving conversations between stakeholders and managers in the development of goals and expectations of outcomes.
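
      A quick back-calculation of the no-action probability implied by the figures quoted above. It assumes that the 26% and the factor of 1.4 refer to the same goal and that the factor is the ratio of the two probabilities, which is my reading rather than something the passage states outright:

```latex
\[
P(\text{goal} \mid \text{no action})
  \approx \frac{P(\text{goal} \mid \text{vaccination})}{1.4}
  = \frac{0.26}{1.4}
  \approx 0.19
\]
```

      Under that reading, vaccination would raise the chance of meeting the modest goal by roughly seven percentage points.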

    2. We show that these uncertainties combine to assure that long-term predictions, e.g., 20 years in Peterson et al. (1991); 35 years in Ebinger et al. (2011); 30 years in Treanor et al. (2010) will be unreliable because credible intervals on forecasts expand rapidly with increases in the forecast horizon (Table 9). Long-range forecasts will include an enormous range of probable outcomes. This finding urges caution in making long-term forecasts with ecological models.

      Cautionary note on making long-term forecasts with ecological models due to decreased accuracy with forecast horizon. This issue is made clear through the proper inclusion of uncertainty in the models.
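
      A minimal sketch of why intervals widen with the forecast horizon. It uses a plain Gaussian random walk, not the authors' bison model; `interval_width`, the noise level `sigma`, and the horizons are all assumptions chosen for illustration.

```python
import random
import statistics


def interval_width(horizon, n_draws=5000, sigma=0.1, seed=1):
    """Width of the central 95% interval of a simulated random walk
    after `horizon` steps (hypothetical process, for illustration only)."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_draws):
        state = 0.0
        for _ in range(horizon):
            # Process error accumulates at every step of the forecast.
            state += rng.gauss(0.0, sigma)
        finals.append(state)
    # n=40 gives cut points at 2.5%, 5%, ..., 97.5%.
    cuts = statistics.quantiles(finals, n=40)
    return cuts[-1] - cuts[0]


widths = {h: interval_width(h) for h in (1, 5, 20)}
for h, w in widths.items():
    print(f"horizon {h:>2}: 95% interval width = {w:.2f}")
```

      Even with no parameter uncertainty at all, the interval grows with the horizon; adding parameter and observation uncertainty, as the paper does, can only widen it further.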

    3. Evaluation of alternatives proceeded in three steps. We first obtained the posterior process distribution of the state at some point in the future, given no action, and calculated the probability that the goal will be met (Fig. 3A). The no-action alternative can be considered a null model to which alternative actions can be compared. Next, we approximated the posterior process distribution at the same point in the future assuming that we have implemented an alternative for management and calculated the probability that the goal will be met (Fig. 3B). Finally, we calculated the ratio of the probability of meeting our goal by taking action over the probability if we take no action. This ratio quantifies the net effect of management (Fig. 3C) and permits statements such as “Taking the proposed action is five times more likely to reduce seroprevalence below 40% relative to taking no action.” This process for evaluating alternative actions explicitly incorporates uncertainties in the future state of the population in the presence and absence of management. A useful feature of this approach is that the weight of evidence for taking action diminishes as the uncertainty in forecasts increases. That is, increasing uncertainty in forecasts compresses the hatched area in Fig. 3C. This result encourages caution in taking action. Also useful is the inverse relationship between the absolute probability that a goal will be met by management and the probability that it will be met relative to taking no action. As the ambition of objectives increases (e.g., the dashed line in Fig. 3 moves to the left), the absolute probability that the management action will be achieved declines (the hatched area in Fig. 3B shrinks), but the probability of success relative to taking no action increases (the hatched area in Fig. 3C expands). This feature represents a fundamental trade-off in choosing goals and actions that are present in all management decisions: objectives that are not ambitious are easy to meet by applying management, but they might be met almost as easily by taking no action.

      This is an exemplar of how to use complex process oriented models to inform the value of management decisions.
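
      The three steps can be sketched with simulated draws standing in for the two posterior distributions. Everything numeric here is made up for illustration: the Gaussian draws, their means and spreads, and the names `prob_goal_met` and `evaluate` are assumptions, not the authors' model or results.

```python
import random

rng = random.Random(42)
N = 50_000

# Stand-ins for posterior draws of future seroprevalence (hypothetical values).
no_action = [rng.gauss(0.55, 0.08) for _ in range(N)]
with_action = [rng.gauss(0.48, 0.08) for _ in range(N)]


def prob_goal_met(draws, threshold):
    """Share of draws in which seroprevalence falls below the goal."""
    return sum(d < threshold for d in draws) / len(draws)


def evaluate(threshold):
    """Steps 1-3: P(goal | no action), P(goal | action), and their ratio."""
    p_none = prob_goal_met(no_action, threshold)
    p_act = prob_goal_met(with_action, threshold)
    # Assumes p_none > 0, which holds for these illustrative draws.
    return p_none, p_act, p_act / p_none


modest = evaluate(0.40)     # goal: seroprevalence below 40%
ambitious = evaluate(0.35)  # goal line moved to the left

for label, (p_none, p_act, ratio) in (("modest", modest), ("ambitious", ambitious)):
    print(f"{label}: no action {p_none:.3f}, action {p_act:.3f}, ratio {ratio:.1f}")
```

      With these stand-in distributions the more ambitious goal is met less often in absolute terms but gives a larger ratio relative to no action, which is the trade-off the passage describes.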

    4. The model omits covariates describing weather conditions, e.g., drought severity, which have been included in other models of bison population dynamics in Yellowstone (Fuller et al. 2007a). We justify this omission because our central objective was to develop a forecasting model. We use the term forecast to mean predictions of future states accompanied by coherent estimates of uncertainty arising from the failure of the model to represent all of the influences that shape the population's future trajectory.
      1. Another nice example of needing to make choices about what complexity to include in the model.
      2. An example of an explicit choice to avoid including environmental factors since they themselves would have to be forecast.
    5. The model is not spatially explicit. Although there is evidence that the population is made up of two different herds that spend their summers in the northern and central portions of Yellowstone National Park (Olexa and Gogan 2007), we justify our decision to treat the population without spatial structure as a first approximation of its behavior and because recent evidence suggests that substantial movement between herds occurs annually (Gates et al. 2005, Fuller et al. 2007, White and Wallen 2012).

      One of the things I really like about this paper is that it highlights that no matter how many statistical complexities are included in a model there are always more that could be. It isn't tractable to include them all so you choose the ones you think are most important based on available evidence and your professional judgement.

  16. Aug 2017
    1. Thus, predicting species responses to novel climates is problematic, because we often lack sufficient observational data to fully determine in which climates a species can or cannot grow (Figure 3). Fortunately, the no-analog problem only affects niche modeling when (1) the envelope of observed climates truncates a fundamental niche and (2) the direction of environmental change causes currently unobserved portions of a species' fundamental niche to open up (Figure 5). Species-level uncertainties accumulate at the community level owing to ecological interactions, so the composition and structure of communities in novel climate regimes will be difficult to predict. Increases in atmospheric CO2 should increase the temperature optimum for photosynthesis and reduce sensitivity to moisture stress (Sage and Coleman 2001), weakening the foundation for applying present empirical plant–climate relationships to predict species' responses to future climates. At worst, we may only be able to predict that many novel communities will emerge and surprises will occur. Mechanistic ecological models, such as dynamic global vegetation models (Cramer et al. 2001), are in principle better suited for predicting responses to novel climates. However, in practice, most such models include only a limited number of plant functional types (and so are not designed for modeling species-level responses), or they are partially parameterized using modern ecological observations (and thus may have limited predictive power in no-analog settings).

      Very nice summary of some of the challenges to using models of contemporary species distributions for forecasting changes in distribution.

    2. In eastern North America, the high pollen abundances of temperate tree taxa (Fraxinus, Ostrya/Carpinus, Ulmus) in these highly seasonal climates may be explained by their position at the edge of the current North American climate envelope (Williams et al. 2006; Figure 3). This pattern suggests that the fundamental niches for these taxa extend beyond the set of climates observed at present (Figure 3), so that these taxa may be able to sustain more seasonal regimes than exist anywhere today (eg Figure 1), as long as winter temperatures do not fall below the −40°C mean daily freezing limit for temperate trees (Sakai and Weiser 1973).

      Recognizing where species are relative to the observed climate range will be important for understanding their potential response to changes in climate. This information should be included when using distribution models to predict changes in species distributions. Ideally this information could be used in making point estimates, but at a minimum understanding its impact on uncertainty would be a step forward.

  17. Jan 2017
    1. We also did not allow different portions of our study area to respond to climate in different ways. Doing so would require spatially varying climate effects and a substantial increase in computational time. However, in future applications, it will be important to allow climate effects to vary over space to better capture reality. Conn et al. (2015) provide examples of how such spatiotemporal interactions can be included in abundance models. We might expect climate effects to interact with spatial covariates such as soil type, slope, and aspect.

      Interesting point about the potential importance of spatiotemporal interactions.

    2. To simulate equilibrium sagebrush cover under projected future climate, we applied average projected changes in precipitation and temperature to the observed climate time series. For each GCM and RCP scenario combination, we calculated average precipitation and temperature over the 1950–2000 time period and the 2050–2098 time period. We then calculated the absolute change in temperature between the two time periods (ΔT) and the proportional change in precipitation between the two time periods (ΔP) for each GCM and RCP scenario combination. Lastly, we applied ΔT and ΔP to the observed 28-year climate time series to generate a future climate time series for each GCM and RCP scenario combination. These generated climate time series were used to simulate equilibrium sagebrush cover.

      This is an interesting approach to forecasting future climate values with variation.

      1. Use GCMs to predict long-term change in climate condition
      2. Add this change to the observed time-series
      3. Simulate off of this adjusted time-series

      Given that short-term variability may be important, that it is not the focus of the long-term GCM models, and that the goal here is modeling equilibrium (not transitional) dynamics, this seems like a nice compromise approach to capture both long-term and short-term variation in climate.
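
      The three steps above can be sketched as follows. This is my reading of the quoted method, not the authors' code: the function name `delta_adjust` and its arguments are hypothetical, and I am assuming the proportional precipitation change is applied multiplicatively as (1 + ΔP).

```python
def mean(values):
    return sum(values) / len(values)


def delta_adjust(obs_temp, obs_precip,
                 gcm_hist_temp, gcm_fut_temp,
                 gcm_hist_precip, gcm_fut_precip):
    """Shift an observed climate series by the GCM-projected mean change."""
    # Absolute change in mean temperature between the two GCM periods.
    delta_t = mean(gcm_fut_temp) - mean(gcm_hist_temp)
    # Proportional change in mean precipitation (assumed fractional form).
    delta_p = (mean(gcm_fut_precip) - mean(gcm_hist_precip)) / mean(gcm_hist_precip)
    # The observed year-to-year variability is kept; only the level moves.
    future_temp = [t + delta_t for t in obs_temp]
    future_precip = [p * (1.0 + delta_p) for p in obs_precip]
    return future_temp, future_precip


# Toy numbers: +2 degrees and +10% precipitation applied to a two-year record.
future_temp, future_precip = delta_adjust(
    obs_temp=[9.0, 11.0], obs_precip=[50.0, 200.0],
    gcm_hist_temp=[10.0, 10.0], gcm_fut_temp=[12.0, 12.0],
    gcm_hist_precip=[100.0, 100.0], gcm_fut_precip=[110.0, 110.0],
)
print(future_temp, future_precip)
```

      In the paper this would be repeated once per GCM and RCP scenario combination, giving one adjusted series per combination.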

    3. Our process model (in Eq. (2)) includes a log transformation of the observations ($\log(y_{t-1})$). Thus, our model does not accommodate zeros. Fortunately, we had very few instances where pixels had 0% cover at time $t-1$ (n = 47, which is 0.01% of the data set). Thus, we excluded those pixels from the model fitting process. However, when simulating the process, we needed to include possible transitions from zero to nonzero percent cover. We fit an intercept-only logistic model to estimate the probability of a pixel going from zero to nonzero cover: $y_i \sim \mathrm{Bernoulli}(\mu_i)$ (8), $\mathrm{logit}(\mu_i) = b_0$ (9), where $y$ is a vector of 0s and 1s corresponding to whether a pixel was colonized (>0% cover) or not (remains at 0% cover) and $\mu_i$ is the expected probability of colonization as a function of the mean probability of colonization ($b_0$). We fit this simple model using the “glm” command in R (R Core Team 2014). For data sets in which zeros are more common and the colonization process more important, the same spatial statistical approach we used for our cover change model could be applied and covariates such as cover of neighboring cells could be included.

      This seems like a perfectly reasonable approach in this context. As models like this are scaled up to larger spatial extents, the proportion of locations with zero abundance will increase, so generalizing the method will require a different way of handling zeros.
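
      For an intercept-only logistic model, the maximum-likelihood intercept is simply the logit of the observed colonization proportion, so the quoted model can be sketched without a fitting routine. The authors used R's glm; the Python below is an equivalent shortcut under that assumption, and the function names and toy data are hypothetical.

```python
import math
import random


def fit_intercept_only_logit(y):
    """MLE of b0 in logit(mu) = b0: the logit of the observed proportion of 1s."""
    p = sum(y) / len(y)
    if p in (0.0, 1.0):
        raise ValueError("b0 is not finite when all outcomes are identical")
    return math.log(p / (1.0 - p))


def inv_logit(b0):
    return 1.0 / (1.0 + math.exp(-b0))


def simulate_colonization(n_pixels, b0, rng):
    """Draw 0/1 colonization outcomes for pixels that currently have zero cover."""
    mu = inv_logit(b0)
    return [1 if rng.random() < mu else 0 for _ in range(n_pixels)]


# Toy data: one of four zero-cover pixels was colonized.
b0 = fit_intercept_only_logit([1, 0, 0, 0])
draws = simulate_colonization(10, b0, random.Random(0))
print(round(inv_logit(b0), 2), draws)
```

      Adding covariates such as neighboring-cell cover, as the authors suggest, would replace the closed form with an iterative fit.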

    4. Our approach models interannual changes in plant cover as a function of seasonal climate variables. We used daily historic weather data for the center of our study site from the NASA Daymet data set (available online: http://daymet.ornl.gov/). The Daymet weather data are interpolated between coarse observation units and capture some spatial variation. We relied on weather data for the centroid of our study area.

      This seems to imply that only a single environmental time series was used across all of the spatial locations. This is reasonable given the spatial extent of the data, but location-specific environmental time series will be needed to generalize this to large spatial extents.

    5. Because SDMs typically rely on occurrence data, their projections of habitat suitability or probability of occurrence provide little information on the future states of populations in the core of their range—areas where a species exists now and is expected to persist in the future (Ehrlén and Morris 2015).

      The fact that most species distribution models treat locations within a species range as being of equivalent quality for the species regardless of whether there are 2 or 2000 individuals of that species is a core weakness of the occupancy based approach to modeling these problems. Approaches, like those in this paper, that attempt to address this weakness are really valuable.

  18. Dec 2016
    1. Our abilities to make observations are limited to a small range of space and time scales (8), limiting our capacity for understanding ecosystems and forecasting how they will respond to local and global change.

      Our abilities to manage natural systems are also typically limited to a small range of space and time scales.

    2. A range of information sources, which can include models, is used to develop alternative plausible trajectories of ecosystems; uncertainties about the future are represented by the range of conditions captured by the ensemble of scenarios. In contrast, forecasts narrowly limit uncertainties to those associated with a single potential outcome that is assumed to be predictable

      This strong distinction between "forecasts" and "scenarios" seems like a rather arbitrary distinction on the surface. There are forecasting approaches that attempt to account for uncertainty in a broad array of things including uncertainty in the generating model. Many of the examples in Principles of Forecasting by J. Scott Armstrong are what would be described as "scenario" based approaches here. Likewise some of the approaches employed by forecasters in Superforecasting by Tetlock & Gardner involve developing a range of scenarios.

      Scenarios in general need to have a reasonable probability of occurrence to be usefully included in decision making. So, at least at some minimum threshold, a probability is being associated with scenarios. Going one step further and assigning a probability to each member of a set of scenarios would result in a probabilistic forecast.

      In short, it seems to me that scenario development is, in many cases, a kind of forecasting. It may involve large uncertainties and it may currently be associated with different kinds of decision making, like choosing management practices that are robust to many possible models, but these can both be accomplished in other ways. Using language that implies that these are completely distinct approaches seems likely to cause confusion and unnecessary terminological debate.

  19. Nov 2016
    1. Practices in the field of financial investing provide a good analogy to the stance we suggest for ecological predictions. A great deal of money and effort has been used to model the best ways to maximize investment returns (certainly more money and effort than has been used to refine ecological predictions). Although this work has resulted in greatly increased understanding of economic systems, the risks and limitations of using sophisticated economic models to make investments has led more and more investors to instead use simple, safe index funds. Essentially this is the recognition that the models and expert opinions are of exceptionally little value in making accurate, long-range predictions in this field and that precautionary strategies are a far better alternative.

      The market is quite different from ecology in the sense that it responds to the predictions/forecasts that are made about it. The idea of the "efficient market" is one of the reasons why modeling the market is believed to be inherently difficult.

      It is also worth noting that investing in index funds is based on a simple model, that the market always increases at rates greater than inflation in the long run. In other words, an inclination towards index funds suggests that some aspects of the market are forecastable at some time-scales. Paying attention to what aspects of ecological systems are less susceptible to surprises (and at what scales) would be a useful route forward.

    1. Scenarios were initially developed by Herman Kahn in response to the difficulty of creating accurate forecasts (Kahn & Wiener 1967; May 1996). Kahn worked at the RAND Corporation, an independent research institute with close ties to the U.S. military. He produced forecasts based on several constructed scenarios of the future that differed in a few key assumptions (Kahn & Wiener 1967). This approach to scenario planning was later elaborated upon at SRI International (May 1996), a U.S. research institute, and at Shell Oil (Wack 1985a, 1985b; Schwartz 1991; Van der Heijden 1996).

      Interesting information on the history of scenario based forecasting.

    2. Prediction means different things to different technical disciplines and to different people (Sarewitz et al. 2000). A reasonable definition of an ecological prediction is the probability distribution of specified ecological variables at a specified time in the future, conditional on current conditions, specified assumptions about drivers, measured probability distributions of model parameters, and the measured probability that the model itself is correct (Clark et al. 2001). A prediction is understood to be the best possible estimate of future conditions. The less sensitive the prediction is to drivers the better (MacCracken 2001). Whereas scientists understand that predictions are conditional probabilistic statements, nonscientists often understand them as things that will happen no matter what they do (Sarewitz et al. 2000; MacCracken 2001). In contrast to a prediction, a forecast is the best estimate from a particular method, model, or individual. The public and decision-makers generally understand that a forecast may or may not turn out to be true (MacCracken 2001). Environmental scientists further distinguish projections, which may be heavily dependent on assumptions about drivers and may have unknown, imprecise, or unspecified probabilities. Projections lead to “if this, then that” statements (MacCracken 2001).

      This distinction between "prediction" and "forecast" is not something I've generally seen in either the ecological forecasting literature or the forecasting literature more generally. This use is backed only by a citation to a guest editorial in a zine (think newsletter), so while I appreciate the need to be clear about uncertainty I don't think this treatment of the terminology is a particularly effective way to accomplish this.

    1. We hypothesize that precipitation levels during the preceding wet season and during the onset of the dry season in forests of Southern Hemisphere South America act as a key regulator of drought intensity during the subsequent dry season.

      The hypothesized causal pathway does not run directly through precipitation during the peak fire period, but through factors that are influenced on longer time scales.

      This is advantageous for forecasting because it means that the strongest signals occur several months in advance of the period being forecast. If the strongest response were more proximate, advance forecasts would be weaker, because they would have to rely on SST values from earlier months as estimates of the later values that could, in principle, produce stronger forecasts.

    2. We defined our empirical predictive model as a linear combination of the two climate indices sampled during these months of maximum correlation: $\mathrm{FSS}_{\mathrm{predicted}}(x,t) = a(x) \times \mathrm{ONI}[t, m(x) - \tau_{\mathrm{ONI}}(x)] + b(x) \times \mathrm{AMO}[t, m(x) - \tau_{\mathrm{AMO}}(x)] + c(x)$

      If I'm reading this and the supplemental material correctly this is a two step modeling approach applied independently to each region.

      1. For each region identify the month of ONI and AMO that are most tightly correlated with FSS.
      2. Use the values of ONI and AMO for the selected months to build a two-variable multiple regression.

      If this is right, it seems like the model could be improved further by incorporating lagging methods like https://doi.org/10.1111/ele.12399 and by making the model spatially explicit so that neighboring regions can borrow strength from one another. However, the ability to apply these kinds of approaches may be necessarily limited by the small size of the available training set.
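
      A minimal sketch of the two-step approach as I read it, for a single region. The helper names, the dictionary layout (candidate month mapped to one index value per year), and the toy numbers are all assumptions; I also assume "most tightly correlated" means the largest absolute Pearson correlation.

```python
import math


def mean(values):
    return sum(values) / len(values)


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def best_month(index_by_month, fss):
    """Step 1: pick the candidate month whose index correlates most strongly with FSS."""
    return max(index_by_month, key=lambda m: abs(pearson(index_by_month[m], fss)))


def fit_two_predictor_ols(x1, x2, y):
    """Step 2: least-squares fit of y = a*x1 + b*x2 + c via centered normal equations."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    s11 = sum((u - m1) ** 2 for u in x1)
    s22 = sum((v - m2) ** 2 for v in x2)
    s12 = sum((u - m1) * (v - m2) for u, v in zip(x1, x2))
    s1y = sum((u - m1) * (w - my) for u, w in zip(x1, y))
    s2y = sum((v - m2) * (w - my) for v, w in zip(x2, y))
    det = s11 * s22 - s12 ** 2  # zero if the two predictors are collinear
    a = (s1y * s22 - s2y * s12) / det
    b = (s2y * s11 - s1y * s12) / det
    return a, b, my - a * m1 - b * m2


# Toy data for five years: FSS is built as 2*ONI + 3*AMO + 1 so the fit can be checked.
oni = {1: [1.0, -1.0, 1.0, -1.0, 1.0], 2: [0.0, 1.0, 2.0, 3.0, 4.0]}
amo = {1: [1.0, 0.0, 1.0, 0.0, 2.0]}
fss = [4.0, 3.0, 8.0, 7.0, 15.0]

oni_month = best_month(oni, fss)
amo_month = best_month(amo, fss)
a, b, c = fit_two_predictor_ols(oni[oni_month], amo[amo_month], fss)
print(oni_month, amo_month, a, b, c)
```

      With only nine years of fire counts per region, the month selection in step 1 is itself a source of overfitting, so any evaluation would need to repeat that step inside the cross-validation loop.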

    3. we used 2001–2009 fire counts detected by the Moderate Resolution Imaging Spectroradiometer (MODIS)

      The success of this model with only small amounts of training data is encouraging for other areas of ecology and environmental science where the available time-series may be short.

    4. Fire season severity, here defined as the sum of satellite-based active fire counts in a 9-month period centered at the peak fire month, depends on multiple parameters that influence fuel moisture levels and fire activity in addition to precipitation, including vapor pressure deficits, wind speeds, ignition sources, land use decisions, and the duration of the dry season. As a result, the relationship between FSS and SSTs may be more complex than the relationships between precipitation and SSTs described above.

      This recognition of additional factors that could influence fire, and the fact that more complex models using the same data may be able to indirectly use some of these influences, is really valuable. It is, in effect, positing that latent variables associated with some of these causes may be associated with measurable aspects of SST.

    5. This is a nice example of chaining together separate pieces of knowledge to understand what form of forecasting model might be successful. Large scale climate phenomena -> variation in precipitation -> variation in fire season severity.

    1. To predict FSS at or before the beginning of the fire season, we established a cutoff (minimum) lead time of 3 month

      It would be interesting to know how certainty in the results continued to improve as the last few months of data became available. If the improvements were substantial, it could justify considering a policy of more last-minute reallocation of resources.

    1. My thoughts on Climatic Associations of British Species Distributions Show Good Transferability in Time but Low Predictive Accuracy for Range Change by Rapacciuolo et al. (2012).

    2. Whilst the consensus method we used provided the best predictions under AUC assessment – seemingly confirming its potential for reducing model-based uncertainty in SDM predictions [58], [59] – its accuracy to predict changes in occupancy was lower than most single models. As a result, we advocate great care when selecting the ensemble of models from which to derive consensus predictions; as previously discussed by Araújo et al. [21], models should be chosen based on aspects of their individual performance pertinent to the research question being addressed, and not on the assumption that more models are better.

      It's interesting that the ensembles perform best overall but more poorly for predicting changes in occupancy. It seems possible that ensembling multiple methods is basically resulting in a more static prediction, i.e., something closer to a naive baseline.

    3. Finally, by assuming the non-detection of a species to indicate absence from a given grid cell, we introduced an extra level of error into our models. This error depends on the probability of false absence given imperfect detection (i.e., the probability that a species was present but remained undetected in a given grid cell [73]): the higher this probability, the higher the risk of incorrectly quantifying species-climate relationships [73].

      This will be an ongoing challenge for species distribution modeling, because most of the data appropriate for these purposes is not collected in such a way as to allow the straightforward application of standard detection probability/occupancy models. This could potentially be addressed by developing models for detection probability based on species and habitat type. These models could be built on smaller/different datasets that include the required data for estimating detectability.

    4. an average 87% of grid squares maintaining the same occupancy status; similarly, all climatic variables were also highly correlated between time periods (ρ>0.85, p<0.001 for all variables). As a result, models providing a good fit to early distribution records can be expected to return a reasonable fit to more recent records (and vice versa), regardless of whether relevant predictors of range shift have actually been captured. Previous studies have warned against taking strong model performance on calibration data to indicate high predictive accuracy to a different time period [20], [24]–[26]; our results indicate that strong model performance in a different time period, as measured by widespread metrics, may not indicate high predictive accuracy either.

      This highlights the importance of comparing forecasts to baseline predictions to determine the skill of the forecast vs. the basic stability of the pattern.

    5. Most variation in the prediction accuracy of SDMs – as measured by AUC, sensitivity, CCRstable, CCRchanged – was among species within a higher taxon, whilst the choice of modelling framework was as important a factor in explaining variation in specificity (Table 4 and Table S4). The effect of major taxonomic group on the accuracy of forecasts was relatively small.

      This suggests that it will be difficult to know if a forecast for a particular species will be good or not, unless a model is developed that can predict which species will have what forecast qualities.

    6. The correct classification rate of grid squares that remained occupied or remained unoccupied (CCRstable) was fairly high (mean±s.d.  = 0.75±0.15), and did not covary with species’ observed proportional change in range size (Figure 3B). In contrast, the CCR of grid squares whose occupancy status changed between time periods (CCRchanged) was very low overall (0.51±0.14; guessing randomly would be expected to produce a mean of 0.5), with range expansions being slightly better predicted than range contractions (0.55±0.15 and 0.48±0.12, respectively; Figure 3C).

      This is a really important result and my favorite figure in this ms. For cells that changed occupancy status (e.g., a cell that was occupied at t_1 and unoccupied at t_2), most models had about a 50% chance of getting the change right (i.e., a coin flip).
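
      A minimal sketch of how the two rates could be computed, assuming binary occupancy (1 = occupied, 0 = unoccupied) per grid cell. The function name `ccr_by_change` and the toy data are hypothetical and are not taken from the paper.

```python
def ccr_by_change(obs_t1, obs_t2, pred_t2):
    """Return (ccr_stable, ccr_changed) for binary occupancy lists.

    Assumption: a cell is "stable" if its observed status is the same in
    both periods and "changed" otherwise; a prediction is correct when it
    matches the observed status at t2.
    """
    stable_hits = stable_n = changed_hits = changed_n = 0
    for o1, o2, p2 in zip(obs_t1, obs_t2, pred_t2):
        correct = int(p2 == o2)
        if o1 == o2:
            stable_n += 1
            stable_hits += correct
        else:
            changed_n += 1
            changed_hits += correct
    ccr_stable = stable_hits / stable_n if stable_n else float("nan")
    ccr_changed = changed_hits / changed_n if changed_n else float("nan")
    return ccr_stable, ccr_changed


# Toy data (hypothetical): 8 stable cells and 2 cells that changed status.
obs_t1 = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
obs_t2 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]

# A "no change" forecast simply repeats the t1 map.
naive_pred = list(obs_t1)
stable, changed = ccr_by_change(obs_t1, obs_t2, naive_pred)
overall = sum(p == o for p, o in zip(naive_pred, obs_t2)) / len(obs_t2)
```

      In this toy case the static forecast gets every stable cell right and every changed cell wrong, yet its overall accuracy is still 0.8, which is why a single range-wide score can hide poor performance on change.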

    7. The consensus method Mn(PA) produced the highest validation AUC values (Figure 1), generating good to excellent forecasts (AUC ≥0.80) for 60% of the 1823 species modelled.

      Simple unweighted ensembles performed best in this comparison of forecasts from SDMs for 1823 species.

    8. Quantifying the temporal transferability of SDMs by comparing the agreement between model predictions and observations for the predicted period using common metrics is not a sufficient test of whether models have actually captured relevant predictors of change. A single range-wide measure of prediction accuracy conflates accurately predicting species expansions and contractions to new areas with accurately predicting large parts of the distribution that have remained unchanged in time. Thus, to assess how well SDMs capture drivers of change in species distributions, we measured the agreement between observations and model predictions of each species’ (a) geographic range size in period t2, (b) overall change in geographic range size between time periods, and (c) grid square-level changes in occupancy status between time periods.

      This is arguably the single most important point in this paper. It is equivalent to comparing forecasts to simple baseline forecasts, as is typically done in weather forecasting. In weather forecasting it is typical to talk about the "skill" of the forecast, which is how much better it does than a simple baseline. In this case the baseline is a species range that doesn't move at all. This would be equivalent to a "naive" forecast in traditional time-series analysis, since we only have a single previous point in time and the baseline is simply the prediction based on this value not changing.
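
      As a rough illustration of the idea, here is a sketch of a skill score against a "no change" baseline. It assumes the common form skill = 1 - MSE(forecast) / MSE(baseline); the paper itself does not compute this quantity, and the function names and toy values are hypothetical.

```python
def mse(pred, obs):
    """Mean squared error between predictions and observations."""
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)


def skill_score(forecast, baseline, obs):
    """Skill relative to a baseline: 1 is perfect, 0 matches the baseline,
    negative is worse than the baseline.

    Assumption: the baseline has non-zero error. If nothing changed between
    periods, the naive baseline is perfect and the ratio is undefined.
    """
    return 1.0 - mse(forecast, obs) / mse(baseline, obs)


# Toy data (hypothetical): observed occupancy in two periods.
obs_t1 = [1, 1, 0, 0, 0, 1]
obs_t2 = [1, 1, 0, 0, 1, 0]

# Naive baseline: the range does not move, so predict t2 = t1.
naive = list(obs_t1)

# A model's predicted probabilities of occupancy at t2.
model = [0.9, 0.8, 0.2, 0.1, 0.5, 0.5]

model_skill = skill_score(model, naive, obs_t2)
```

      With binary observations and probabilistic forecasts, the MSE used here is the Brier score, so this is one simple way to express "better than a range that doesn't move" as a single number.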

    9. Although it is common knowledge that some of the modelling techniques we used (e.g., CTA, SRE) generally perform less well than others [32], [33], we believe that their transferability in time is not as well-established; therefore, we decided to include them in our analysis to test the hypothesis that simpler statistical models may have higher transferability in time than more complex ones.

      The point that providing better or worse fits on held-out spatial training data is not the same as providing better forecasts is important, especially given the argument about simpler models having better transferability.

    10. We also considered including additional environmental predictors of ecological relevance to our models. First, although changes in land use have been identified as fundamental drivers of change for many British species [48]–[52], we were unable to account for them in our models – like most other published accounts of temporal transferability of SDMs [20], [21], [24], [25] – due to the lack of data documenting habitat use in the earlier t1 period; detailed digitised maps of land use for the whole of Britain are not available until the UK Land Cover Map in 1990 [53].

      The lack of dynamic land cover data is a challenge for most SDMs, and certainly for SDM validation using historical data. It would be interesting to know, in general, how much better modern SDMs become based on held-out data when land cover is included.

    11. Great Britain is an island with its own separate history of environmental change; environmental drivers of distribution size and change in British populations are thus likely to differ somewhat from those of continental populations of the same species. For this reason, we only used records at the British extent to predict distribution change across Great Britain.

      This restriction to Great Britain for the model building is a meaningful limitation, since Great Britain will typically represent a small fraction of the total range for many of the species involved. However, this is a common issue for SDMs, so I think it's a perfectly reasonable choice to make here given the data availability. It would be nice to see this analysis repeated using alternative data sources that cover spatial extents closer to that of the species range. This would help determine how well these results generalize to models built at larger scales.

    12. (1) Are climate-based SDMs transferable in time? (2) Can they capture drivers of expansion and contraction of species geographic ranges? (3) What is the relative effect of methodological and taxonomic variation on prediction accuracy?

      These are three of the crucial questions that need to be answered about the performance of SDMs for forecasting. To this list I would add:

      (4) Are the uncertainties associated with SDM forecasts accurate?

    13. Unfortunately, assessing whether they do is notoriously difficult since their main aim is to predict events that are yet to occur [20]; most studies thus measure the transferability of their models using a subset or re-sampled set of the distribution records used to build the models, a limited approach that can greatly inflate estimates of predictive accuracy [20]. For this reason, an emerging approach for estimating the true transferability of SDMs has been to validate model predictions against independent field records documenting shifts in species distributions to novel time periods [20]–[26] and regions [27]–[31]. However, published accounts of such independent model validation have generally lacked methodological or taxonomic breadth.

      The relatively small number of efforts to determine how predictive SDMs are for the future state of species distributions remains an ongoing issue in 2016. This kind of work is crucial to understanding the biases and uncertainty associated with current approaches to distribution modeling.

    1. Despite the absence of mechanistic information about the underlying ecological processes, the relatively simple SSR method consistently outperformed the control models over near-term prediction horizons. This result was robust across all simulations and all life stages of the experimental data. Moreover, the SSR model achieved this feat using only a single time series, whereas the control model used all times series simultaneously (it is an ecologically unrealistic scenario to assume we know the model and have time series for all of the relevant variables). Other analyses have shown that multivariate SSR methods (35) improve with additional information (14, 36), suggesting that the performance of the SSR model tested here represents a lower bound on forecast accuracy attainable with this general approach.

      Readers of this paper should also take a look at the exchange following this paper when interpreting the overall results:

      http://www.pnas.org/content/110/42/E3975.full
      http://www.pnas.org/content/110/42/E3976.full

      In short, Hartig & Dormann (2013) show that, when fit using methods designed to improve fitting under chaotic conditions, the control model outperforms the other models in the case of the logistic model. Perretti et al. respond by showing an example where these methods fail and argue that the true model is never known anyway.

    2. In contrast, the linear forecasting methods were often no better than a prediction of the mean of the test set (represented by SRMSE = 1; Figs. 1 and 2). The performance of the control model varied depending on the system, but it too was often no better than a mean predictor.

      This comparison to the mean of the test set is a nice example of using a baseline to contextualize the strength of the forecast. This is commonly done in weather forecasting, in which case accuracy relative to the mean or other baseline is described as "skill".
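
      A small sketch of why SRMSE = 1 corresponds to a mean predictor, assuming SRMSE is the RMSE divided by the standard deviation of the test set (my reading of the metric, not a definition quoted here). The function name `srmse` and the numbers are hypothetical.

```python
import math
import statistics


def srmse(pred, obs):
    """RMSE standardized by the (population) standard deviation of obs.

    Assumption: the population SD is used, so that always predicting the
    mean of the test set gives exactly 1.
    """
    rmse = math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))
    return rmse / statistics.pstdev(obs)


# Toy test-set time series (hypothetical values).
test_set = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]

# Baseline: always predict the mean of the test set.
mean_forecast = [statistics.fmean(test_set)] * len(test_set)

# A forecast that tracks the series more closely than the mean does.
closer_forecast = [2.5, 3.5, 4.5, 5.0, 6.5, 8.5]

baseline_score = srmse(mean_forecast, test_set)
closer_score = srmse(closer_forecast, test_set)
```

      Values below 1 indicate a forecast that beats the mean predictor, and values at or above 1 indicate no skill relative to it.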

    3. To emulate ecologically realistic conditions, we added log-normal observation error with a CV of 0.2 to all of the training sets. The forecast accuracy was evaluated using the test-set time series without observation error.

      It would be interesting to know the results in the absence of this additional error. This low-error scenario seems likely to benefit the deterministic models.
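
      For anyone wanting to try the with-error and without-error comparison, here is a sketch of adding multiplicative log-normal observation error with a chosen CV. It uses the standard log-normal relation CV = sqrt(exp(sigma^2) - 1). Centering the error so that its mean is 1 is my assumption, since the quoted passage does not say how the noise was centered, and the function names are hypothetical.

```python
import math
import random
import statistics


def lognormal_params_for_cv(cv):
    """Return (mu, sigma) of the underlying normal for a log-normal
    multiplier with the requested CV.

    Assumption: mu is set so the multiplier has mean 1 (unbiased noise).
    """
    sigma = math.sqrt(math.log(1.0 + cv ** 2))
    mu = -0.5 * sigma ** 2
    return mu, sigma


def add_observation_error(series, cv=0.2, rng=None):
    """Multiply each value by an independent log-normal error term."""
    rng = rng or random.Random()
    mu, sigma = lognormal_params_for_cv(cv)
    return [x * rng.lognormvariate(mu, sigma) for x in series]


# Check on a constant series: the noisy values should have CV near 0.2.
noisy = add_observation_error([10.0] * 20000, cv=0.2, rng=random.Random(1))
sample_mean = statistics.fmean(noisy)
sample_cv = statistics.pstdev(noisy) / sample_mean
```

      Fitting the same models to the clean series and to the output of `add_observation_error` would give the low-error comparison described above.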

    4. All control models included log-normal process and observation error, and parameter values that resulted in chaotic or near-chaotic deterministic dynamics.

      This is an important choice for this study because:

      1. It makes the job of fitting the models more challenging
      2. It makes forecasting from the deterministic process in the presence of error inherently challenging
      3. The SSR method is designed for chaotic data
    5. In real systems we often lack time series of important driving variables or species; thus, for this additional reason, this comparison represents a very optimistic scenario for the mechanistic modeling approach.

      This is a really important point regarding the potential strength of time-series-only forecasting models (i.e., those not using external covariates). The error resulting from uncertainty in the forecast of the external driver, and its propagation into the resulting forecasts of the focal outcome, may (in some cases) overwhelm any benefits derived from having the external driver. This won't be true if the external driver is important and relatively forecastable, but it is worth keeping in mind when comparing forecasting methods.

  20. Jul 2016
    1. In April 1950, Charney’s group made a series of successful 24-hour forecasts over North America, and by the mid-1950s, numerical forecasts were being made on a regular basis.

      Roughly 50 years from initial efforts to first successful forecasts.

    2. Charney determined that the impracticality of Richardson’s methods could be overcome by using the new computers and a revised set of equations, filtering out sound and gravity waves in order to simplify the calculations and focus on the phenomena of most importance to predicting the evolution of continent-scale weather systems.

      The complexity of the forecasting problem was initially overcome in the 1940s both by an improved rate of calculation (using computers) and by simplifying the models to focus on the most important factors.

    3. Courageously, Richardson reported his results in his book Weather Prediction by Numerical Process, published in 1922.

      Despite failing to predict the weather accurately, Richardson posted his results publicly. This is an important step in allowing the improvement of forecasting because it makes it possible to learn what works and what doesn't more quickly. See also Brian McGill's 6th P of Good Prediction.

    4. Despite the advances made by Richardson, it took him, working alone, several months to produce a wildly inaccurate six-hour forecast for an area near Munich, Germany. In fact, some of the changes predicted in Richardson’s forecast could never occur under any known terrestrial conditions.

      Nice concise description of the poor performance and impracticality of early weather forecasting.

  21. Jun 2016
    1. Imagine a rotating sphere that is 12,800 kilometers (8000 miles) in diameter, has a bumpy surface, is surrounded by a 40-kilometer-deep mixture of different gases whose concentrations vary both spatially and over time, and is heated, along with its surrounding gases, by a nuclear reactor 150 million kilometers (93 million miles) away. Imagine also that this sphere is revolving around the nuclear reactor and that some locations are heated more during one part of the revolution and other locations are heated during another part of the revolution. And imagine that this mixture of gases continually receives inputs from the surface below, generally calmly but sometimes through violent and highly localized injections. Then, imagine that after watching the gaseous mixture, you are expected to predict its state at one location on the sphere one, two, or more days into the future. This is essentially the task encountered day by day by a weather forecaster (Ryan, Bulletin of the American Meteorological Society,1982).

      This is a great short description of the complexity of weather forecasting. It is interesting to think about how complex this problem is, and yet how simple it is compared to forecasting in other systems, like ecology, economics, etc.