On 2016 Feb 10, Lydia Maniatis commented:
Part 2: DATA, MODEL, AND MODEL-FITTING
The authors propose to “quantify how well summary statistics like averages are calculated [using] an Equivalent Noise (Nagaraja, 1964; Pelli, 1990; Dakin, 2001) framework...” (p. 1). The first two references discuss “luminance noise” and contrast thresholds. The mathematical framework and supporting arguments seem to be chiefly provided by the Pelli (1990) reference (Dakin, 2001, takes the applicability of the Equivalent Noise paradigm to orientation for granted). However, the conclusion of the Pelli chapter includes the following statements: "It will be important to test the model of Fig. 1.7 [the most general, schematic expression of the proposed model, which nevertheless refers specifically to contrast-squared]. For gratings in dynamic white noise, [the main prediction of the model] has been confirmed by Pelli (1981), disconfirmed by Kersten (1984) and reconfirmed by Thomas (1985). More work is warranted.” (p. 18).
Also, Pelli's arguments seem to overlook basic facts of vision, such as the inhibitory mechanisms at the retinal level. Has his model actually been tested in the 25 years since the chapter was written, either with respect to contrast or with respect to orientation? Where are the supporting references? (It is worth noting that Pelli seems to be unfamiliar with the special significance of “disconfirmations,” i.e. falsifications, in the testing of scientific hypotheses. Newton's theory has been confirmed many times, and could continue to be confirmed indefinitely, but it stopped being an acceptable theory after the falsification of a necessary prediction.)
Agnostic as to the perceptual abilities, processes, or functional mechanisms underlying observer performance (the method confounds perception, attention, and cognition), and assuming that a “just-noticeable contrast level” is computationally interchangeable with a “just-comparable angle” (arrived at via “averaging”), the authors proceed to fit the data to a mathematical model.
From data points at two locations on the x-axis, they construct non-linear curves, which differ significantly from observer to observer. If the curves mean anything at all, they predict performance at intermediate x-axis values - unless we are required to assume a priori that the model makes accurate predictions (in which case it is a metaphysical, not an empirical, model). The problem, as mentioned above, is that there is high inter-observer variability, such that the curves differ significantly from one observer to the next. (I also suspect that there was high intra-observer variability, though this statistic is not reported.) Thus, a test of the model's predictions for intermediate x-values would seem to require that we retest the same observers at new levels of the independent variable. (Why weren't observers tested with at least one more x-value?) I'm not at all sure that the results would confirm the predictions, but even if they did, this is supposed to be a general model. So what if we wanted to test it on new observers at new, intermediate levels of the independent variable? How would the investigators arrive at their predictions for this case?
If there are no criteria for testing (i.e. potentially rejecting) the model - if any two data points can always be - can ONLY be - fitted post hoc - then this type of model-fitting exercise lies outside the domain of empirical science.
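The degrees-of-freedom worry can be made concrete. Under one common two-parameter equivalent-noise formulation (of the kind used in the literature the authors cite, e.g. Dakin, 2001), the observed threshold satisfies sigma_obs² = (sigma_int² + sigma_ext²) / n_samp. With two free parameters and only two data points per curve, the “fit” is not a fit at all but an exact algebraic solve. The sketch below is purely illustrative (made-up numbers, not the paper's data or code; the function names are mine):

```python
import math

# Hypothetical two-parameter equivalent-noise model:
#   sigma_obs**2 = (sigma_int**2 + sigma_ext**2) / n_samp
# Given exactly two (sigma_ext, sigma_obs) data points, both free
# parameters are determined algebraically, so residuals are zero.

def fit_en(e1, o1, e2, o2):
    """Solve exactly for (n_samp, sigma_int**2) from two data points."""
    # Subtracting the two squared model equations eliminates sigma_int:
    n_samp = (e1**2 - e2**2) / (o1**2 - o2**2)
    sigma_int_sq = n_samp * o1**2 - e1**2
    return n_samp, sigma_int_sq

def predict(e, n_samp, sigma_int_sq):
    """Predicted observed threshold at external noise level e."""
    return math.sqrt((sigma_int_sq + e**2) / n_samp)

# Made-up data generated from sigma_int = 2, n_samp = 4:
n_samp, sig2 = fit_en(0.0, 1.0, 6.0, math.sqrt(10.0))
print(n_samp, sig2)                 # recovers 4.0 and 4.0 exactly
print(predict(0.0, n_samp, sig2))   # reproduces the first point: 1.0
print(predict(6.0, n_samp, sig2))   # reproduces the second: sqrt(10)
```

Any two points with distinct thresholds yield a zero-residual solution, so a two-point design cannot, by itself, reject the model - the model's test would have to come from predictions at intermediate, untested noise levels.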
It is always possible to compare “models” purporting to answer the wrong question, or to investigate a nonexistent phenomenon. To use a rough example, we could ask, “Is the Sun's orbit around the Earth more consistent with a circular or an elliptical model?” Using the apparent movements of the Sun in relation to the Earth and other cosmic landmarks, we could compare models and conclude that one of them “better fits” the data, or that it "fits the data well" (it's worth noting that the model being fitted here has dozens of free parameters: "The model with the fewest parameters had 55 free parameters" (p. 5)). But this wouldn't amount to a theoretical advance. I think that this is the kind of thing going on here.
Asking what later turns out to be the wrong question is par for the course in science, and excusable if you have a solid rationale consistent with knowledge at the time. Here, this does not seem to be the case.
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.