- Jul 2018
-
europepmc.org
-
On 2015 Dec 28, Lydia Maniatis commented:
The authors reject the idea that the perception of “brightness” is contingent on the segmentation of the visual field into “luminance patches,” claiming that their results do not “support this view.” The supporting rationale is basically unintelligible, and fails on a logical and empirical level.
On a logical level, the authors' own proposal depends on the detection of areas of “approximately” homogeneous luminance, both “targets” and “contexts.” These areas are, de facto, segregated from other areas as part of the frequency-data accumulation process.
On an empirical level, the following claim fails: “knowledge about background and foreground or edges generated by reflectance or illumination is irrelevant to a determination of the percentile of the luminance values in the relevant probability functions...concepts, like brightness, are meaningful only in a probabilistic sense, [and therefore] the statistics that generate brightness are the basis for segmentation and grouping, not the other way around.”
This statement also contains the logical problem noted previously; the empirical problem stems from the implication that, because knowledge (or inference) about foreground and background is irrelevant to the percentile of the luminance values, it is also irrelevant to “brightness” perception. But figure-ground relationships have been shown, empirically, to be key in the perception of lightness and illumination. So even if the authors' data were adequate to corroborate their claims (which they are not, in either a narrow or a general sense), they would not be entitled to simply dismiss conflicting empirical findings.
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY. -
On 2015 Dec 28, Lydia Maniatis commented:
Part 4. Finally, and always in keeping with the casual nature of this article, the authors are content to explain a salient feature of their data – the plateaus at the high and low ends of the distributions - as “presumably reflect[ing] the limited number of samples at these extremes.” If the shape of the distributions has theoretical significance, this offhand presumption is clearly not good enough, though apparently it was good enough for PNAS.
It should be clear that this article (and articles relying on it for their claims) sorely lacks rigor, both in the framing of its empirical questions and in the methods appropriate to test them, and asks the reader to take far too much on an empirically unhealthy combination of ignorance and faith.
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY. -
On 2015 Dec 28, Lydia Maniatis commented:
Part 3. ARBITRARY, PRE-EXISTING DATABASE Samples were obtained from a “database of natural scenes.” This database was compiled by Van Hateren and Van der Schaaf (1998) for a very different purpose, without any of the concerns (e.g. that it be representative of the patterns falling on the retinas of humans and their ancestors over evolutionary time) motivating the present study. Neither the original authors nor the authors of the present article provide any rationale for the choice of scenes http://www.kyb.tuebingen.mpg.de/?id=227. As readers, we are left with the information that the scenes were “natural,” which, for practical purposes, is no information at all. Given the variety of “natural environments” encountered within and across lifetimes, it doesn't seem credible that the frequencies being predicted would happen to arise in this randomly chosen database. At the least, the authors should provide a theoretically-motivated description of the nature of these “natural images.”
VAGUE, UNMOTIVATED, OPAQUE SAMPLING CRITERIA The tested configurations were “superimposed on images to find light patterns in which the luminance values of both the surround and target regions were approximately homogeneous.” For more detail, we are directed to the supplementary material. Clearly, “approximately homogeneous,” like “natural,” is not nearly specific enough for us to either replicate or evaluate the methodological choices being made. What particular level of “approximate homogeneity,” for example, should count as a match, and why? In perception, we know that a small local change may cause a global change in a characteristic of a scene, even though the scene may still be described as “approximately” the same.
The situation is not improved by the supplementary material. With respect to the database, we are told that while it “has several limitations (limited locales, a limited luminance range, and fewer representations of very high and low luminance values than actually occur in nature), it provides a reasonable proxy of normal visual experience.” No criteria are provided or discussed as to what should constitute a “reasonable proxy” of experience. If the authors have criteria, they are keeping them to themselves. In other words, they offer no principle for selecting a database – anyone wanting to replicate this experiment would presumably need to use this particular database, and to take it on faith that it is “reasonable.”
We are told that the configurations subtended 5 degrees of visual angle, and that this was a “dimension within the range of the relevant demonstrations and psychophysical studies.” As a rationale for their methodological choice, this statement contains essentially no information. What is the acceptable range, why, how is the specific “dimension” used motivated by their theoretical rationale, and on what basis are other “dimensions” excluded? Would their inclusion affect the frequency data?
The criteria for choosing the samples were as follows: “(i) the luminances of the relevant regions had to be approximately homogeneous (standard deviation of luminance less than a calculated fraction of the mean luminance for 90% of all samples); and (ii) the luminances of two regions labeled Lu had to be similar, as did the two regions labeled Lv (absolute difference of the mean luminance of the regions ≤ 250 cd/m², or ≈ 0.5% of the luminance range in the database).”
What do the authors mean by “a calculated fraction of the mean luminance?” What was the theoretical rationale for this and the other choices as to what to accept as a valid sample?
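To see how much hangs on these unstated values, the two quoted criteria can be sketched in Python. This is only an illustration under stated assumptions, not the authors' procedure: the function names, the `max_fraction` parameter (standing in for the undisclosed “calculated fraction”) and the sample luminances are all hypothetical; only the 250 cd/m² bound comes from the quoted text.

```python
from statistics import mean, pstdev


def is_homogeneous(region, max_fraction):
    """Criterion (i): standard deviation below a fraction of the mean luminance.

    `max_fraction` is a hypothetical stand-in for the undisclosed
    "calculated fraction" in the quoted criteria.
    """
    return pstdev(region) < max_fraction * mean(region)


def regions_similar(region_a, region_b, max_diff=250.0):
    """Criterion (ii): mean luminances differ by at most `max_diff` cd/m^2."""
    return abs(mean(region_a) - mean(region_b)) <= max_diff


def accept_sample(lu_pair, lv_pair, max_fraction, max_diff=250.0):
    """Accept a sample only if every region is homogeneous and each pair matches."""
    all_regions = list(lu_pair) + list(lv_pair)
    if not all(is_homogeneous(r, max_fraction) for r in all_regions):
        return False
    return (regions_similar(lu_pair[0], lu_pair[1], max_diff)
            and regions_similar(lv_pair[0], lv_pair[1], max_diff))


# Hypothetical pixel luminances (cd/m^2) for the two Lu and the two Lv regions.
lu_pair = ([1000, 1010, 990], [1100, 1090, 1110])
lv_pair = ([5000, 5050, 4950], [5100, 5120, 5080])

accepted_loose = accept_sample(lu_pair, lv_pair, max_fraction=0.05)
accepted_strict = accept_sample(lu_pair, lv_pair, max_fraction=0.005)
print(accepted_loose, accepted_strict)
```

The same hypothetical sample is accepted with a tolerance of 0.05 and rejected with a tolerance of 0.005, which is why the undisclosed fraction matters for any attempt at replication.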
In the aftermath of these arbitrary and vague methodological descriptions – which combine specificity with opaqueness – we are told that the templates used as such are actually very rare in nature. Thus, “Overly stringent criteria will exclude most of the samples or segregate luminance patterns, each set having very few samples. On the other hand, overly loose criteria will collapse many different luminance patterns, obscuring the variations of interest. For the sampling configurations used, in which the luminance values of different regions were homogeneous, rigid criteria are not necessary.” Why not? We are told that “the stimulus patterns in Fig. 1 generate similar perceptual effects even when they are quite noisy.” Really? This claim is made without evidence (on the authors' say-so) and in terms (“similar perceptual effects;” “quite noisy”) that fail to meet the standards of empirical science. In short, we have to take it on faith that the criteria chosen by the authors – e.g. their undisclosed “calculated fraction” – are “just right.” Still, in an excess of rigor, they “examined standard deviations of luminance less than a calculated fraction of the mean luminance for 75%, 50%, or even 10% of all samples. As long as enough samples were obtained from the database, we found quantitatively or qualitatively similar results.” What these results were, the value of “enough,” and the distinction being made here between “quantitatively similar” and “qualitatively similar” are not revealed. “Regional similarity” is treated in a similar way. The criteria are judged just right because a control (is supposed to have) produced results “quantitatively or qualitatively similar in these various conditions, as long as enough samples were obtained.” It seems fair to wonder why, if alternative criteria produced “similar” (statistically significant?) results, this broader set of criteria was not employed in the main experiment.
PERCEPTUAL EFFECTS OF SAMPLES NOT TESTED We are told that samples were chosen such that they respected the local geometry of the templates, and the luminance levels, and that these sampling configurations “generated the same brightness illusions” as the templates. This is meant to be taken on faith, not data, as no observers, other than the authors, were consulted on this matter. This is unacceptable, as the validity of samples requires that they be perceptually equivalent to the templates. A small set of photos provided of some of the samples in each case shows that they are highly heterogeneous, and not obviously “similar.” It is well-known that local changes either within a figure or outside the area of interest can produce large changes in lightness. For example, a target of luminance x lying on a surround of luminance y may appear very different in lightness from another target of the same luminance x lying on a surround of the same luminance y. This can happen if one of the two entire target/surround combinations lies within an area apparently in shadow, and the other apparently in plain view. Surely some of the “equivalent” samples in the current experiment were lying under differing apparent illumination, in which case the assumption that targets were perceptually equivalent, based on the frequency of the template match, would not be valid. In short, it cannot be taken as a given (on the two authors' say-so) that the samples counted are perceptually equivalent to the reference stimuli. Without this information, the predictions cannot be said to have been corroborated, even for this narrow set of cases. (In the case of White's illusion, the authors make inaccurate claims even with respect to the illusion itself, claiming that the “component” of the illusion used as a template “elicited much the same effect as the usual presentation.” This is not true. The complete figure elicits a stronger effect, and one that includes a transparency effect on the lighter side.)
OTHER DATA-FREE ASSERTIONS Our faith in the authors' judgment is recruited also with respect to the issue of “scale invariance.” We're assured that “Four scales, including the scale of the original images, were tested and found to produce approximately the same conditional probability distribution functions.” Data.
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY. -
On 2015 Dec 28, Lydia Maniatis commented:
Part 2. THE HYPOTHESIS IS LAMARCKIAN The hypothesis does not appear consistent with the principles of evolution on the basis of natural selection. Unless I misunderstand, it seems to require not only that stimulus frequency information be accumulated throughout the lifetime of each organism, but also that this knowledge be passed on genetically – which is a form of Lamarckism. Natural selection doesn't work that way – mutations are random, lucky shots in the dark, not the result of accumulated, recorded luminance or responses to luminance – experience only comes into play after the fact, in the sense that if mutations happen to be adaptive, they undergo positive selection.
In addition, the hypothesis would seem to predict that populations living in highly different environments – one living, for example, in the arctic, the other in the dark jungle – would have different lightness codes. But even fish, for example, appear to possess color perception mechanisms similar to those of humans.
There is also a more subtle problem with the proposal; it purports to explain the production of particular qualia (light, dark) but doesn't actually touch on this problem at all. The luminance/lightness connection proposed is physically arbitrary, an accident of experience. Why should there be adaptive pressure to code physical conditions on the basis of the chance frequency with which they are encountered, rather than on the basis of features and factors directly relevant to survival and reproduction (a rare event might have more powerful effects)?
THE HYPOTHESIS IS NEITHER NECESSARY NOR SUFFICIENT AND LEAVES MANY QUESTIONS UNANSWERED The hypothesis proposes to explain why two surfaces of equal luminance, in a very limited number and type of displays, produce slightly different percepts; it does not try to explain the quite regular, in many contexts, relationship between luminance and lightness, specifically that the latter rises or falls with the former under certain describable conditions (when the structure of the pattern indicates an area under homogeneous illumination). In other words, although it is the case that the two equal target luminances in the simultaneous contrast demo differ in appearance, it is also the case that continuously increasing the luminance of either one will cause a continuous increase in its lightness. Thus, the hypothesis implies that, for any given context, the frequency of the target/context combination increases with increasing luminance of the target.
The authors seem to realize this when they state that: “by definition, the percentile of target luminance for the lowest luminance value within any contextual light pattern is 0% and corresponds to the perception of maximum darkness, the percentile for the highest luminance ...is 100% [?] and corresponds to the maximum perceivable brightness.” But they seem to be confusing prediction, definition and fact. We know as a matter of FACT – from empirical experience, not definition, not hypothesis - that increasing luminance of a patch in any given setting tends to increase its lightness. But this is not a logical implication of the frequency hypothesis (and certainly not an a priori “definition.”) Even assuming it is possible, methodologically, to show that the “percentiles” coincide with this straightforward luminance/lightness relationship, the “frequency hypothesis,” lacking any discernible rationale, would not seem to possess a logical advantage over the hypothesis that there is simply an adaptive value to a regular coding of lightness and relative luminance, i.e. to coding higher luminance with higher lightness values, with corrections for the sake of accurately separating perception of surfaces and perception of apparent illumination. The authors seem to imply that their proposal is superior to such accounts because it does not require a direct relationship between luminance and lightness – but no serious alternative could require any such thing.
Given the fairly straightforward within-context luminance/lightness relationship, the hypothesis reduces to the idea that a change in the luminance of the background of a target can change the range of the target lightness values arising in perception. Why should this be? If the entire range for a “context” is shifted upwards, for example, should we assume that that context has a higher percentile than the lower-range-producing context, and that, again, the correspondingly regular luminance/lightness relationship within that range is a coincidence of frequency? And, finally, why does the frequency argument not apply to the apparent lightness of the background? Or does it? In this case, what happens if a low-frequency target/context combination occurs on a high-frequency context? Would the frequency of a single-luminance context equal the frequency of that luminance? Do some luminances occur more frequently than others? Or would we have to evaluate the frequency of each “context” given its own various possible contexts? And what about the fact that more global changes are known to be able to affect local lightness (rendering any particular cut-off of “context” arbitrary and uninformative)?
WHAT DOES IT MEAN THAT THE PROJECTION HAS “HIGHLY-STRUCTURED STATISTICS?” The authors make a mystifying claim about the nature of the retinal projection. We are told that it consists of 2D patterns of light intensity with “highly structured statistics.” What does this mean? How can one collect and evaluate the “structured statistics” in the pattern?
One thing is certain – the pattern of light intensities in the projection is wholly unpredictable, as is the pattern of light intensities in the environment. The light reflected to the eye depends on both the characteristics of surfaces and on the light falling on them. Both change from location to location, but the latter is also unstable within locales, shadows depending on chance locations and orientations of objects, on their shapes and relative locations, the location of the sun, the presence of clouds, etc. Given that the order of the shapes, and the shapes appearing with any given glance, is also unpredictable, it is difficult to see what the authors are talking about when they refer to “highly structured statistics.” They need to explain what they mean.
Methodological problems LACK OF TESTABILITY Even if we ignore the logical, practical and theoretical problems with the hypothesis, it seems impossible to test. Even if our sole interest were the classic simultaneous contrast illusion, and even if such configurations appeared as such in nature (which they do not), how could we develop a database of the frequencies of any target luminance on any background luminance (let alone all of them) falling on the retinas of all of the individuals forming the lineage of our species (until the process is supposed to have stopped (see above)), across all of the days and environments traversed, morning, noon and night, in the sea and on the land? The hypothesis does not entail any principled relationship between luminance and lightness, unless this relationship were in turn entailed by the luminance/frequency relationship for all possible contexts; thus, it is not clear what the criteria might be for obtaining a valid sample for testing. As it is, the notion has not been properly tested even for the narrow framing of this article, because the methods for choosing samples (the details of which are relegated to supplementary, online material) are too vague, arbitrary and opaque at key points to allow any attempts at replication.
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY. -
On 2015 Dec 28, Lydia Maniatis commented:
Part 1. I have criticized Purves and colleagues' various iterations of the “wholly empirical” account of visual perception in various comments on PubMed Commons and on PubPeer. Here, I want to present a more comprehensive and organized critique, focussing on a study that has served as a centerpiece of these accounts. I think it is important to clarify the fact that the story is profoundly inadequate at both the conceptual and the methodological levels, and fails to meet fundamental criteria of empirical science (logical consistency and testability). Such criticism (which is not welcome in the published literature) is important, because if “anything goes” in published research at even the most elite journals, then this literature cannot help to construct the conceptual infrastructure necessary for progress in the visual sciences.
Conceptual problems LACK OF THEORETICAL MOTIVATION The hypothesis that the authors claim to have tested is effectively conjured out of thin air, subsequent to an exposition that is inaccurate, inadequate, and incoherent.
The phenomenon to be explained is inadequately described. We are told, first, that the luminance of a “visual target” elicits a “brightness” percept that can vary according to “context.”
The term brightness, here, is being used to refer to the impression of white-gray-black that a surface may elicit (and which is currently referred to by researchers as its “lightness.”) However, surfaces do not necessarily, or even usually, elicit a unitary percept; they often produce the impression of double layers, e.g. a shadow overlying a solid surface. Both the shadow and the surface have a perceptual valence, a “brightness.” Indeed, it is under conditions where one “target” appears to lie within the confines of a double layer, and the other does not, that the most extreme apparent differences between equiluminant targets arise.
The logic is simple: If a surface appears to lie under a shadow, then it will appear lighter than an equiluminant target that appears to lie outside of the shadow, because a target that emits the same amount of light under a lower illumination as one under stronger illumination must have a higher light-reflecting tendency (and this is what the visual system labels using the white-gray-black code). The phenomenon and logic of double layers are not even hinted at by Yang and Purves (2004); rather, the impression created is that the relationship between “brightnesses” under different “contexts” is not in the least understood or even amenable to rational analysis. (In addition, the achromatic conditions being considered are rare or even non-existent in natural, daytime conditions; but the more complex phenomenon of chromatic contrast is not mentioned.)
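The shadow inference described above can be written out as simple arithmetic. The sketch below assumes the common simplification that the light a surface sends to the eye is proportional to its reflectance multiplied by the illumination falling on it; the function name, the arbitrary units and the numbers are hypothetical and serve only as illustration.

```python
def implied_reflectance(luminance, illumination):
    """Reflectance implied by a luminance under an assumed illumination.

    Assumes the simplified model: luminance = reflectance * illumination
    (arbitrary, consistent units).
    """
    if illumination <= 0:
        raise ValueError("illumination must be positive")
    return luminance / illumination


# Two equiluminant targets: one apparently in shadow, one in plain view.
luminance = 30.0
in_shadow = implied_reflectance(luminance, illumination=50.0)
in_plain_view = implied_reflectance(luminance, illumination=100.0)

# The target under the weaker illumination must be the more reflective surface.
print(in_shadow, in_plain_view)
```

Under this assumption the shadowed target comes out as the more reflective of the two, which is the relationship the white-gray-black code is said to label.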
After this casual attempt to convince readers that vision science is completely in the dark on the subject of “brightness,” the authors assert that “A growing body of evidence has shown that the visual system uses the statistics of stimulus features in natural environments to generate visual percepts of the physical world.” No empirical studies are cited or described to clarify or support this rather opaque assertion. The only citation is of a book containing, not evidence, but “models” of brain function. The authors go on to state that “if so, the visual system must incorporate these statistics as a central feature of processing relevant to brightness and other visual qualia.” (It should be noted that the terms “statistics” and “features” here contain no concrete information. The authors could be counting the stones on Brighton beach.)
Even if we gave the authors the latitude to define “statistics” and “features” any way they wish, the following sentence would still come out of the blue: “Accordingly, we propose that the perceived brightness elicited by the luminance of a target in any given context is based on the value of the target luminance in the probability distribution function of the possible values that co-occur with that contextual luminance experienced during evolution. In particular, whenever the target luminance in a given context corresponds to a higher value in the probability distribution function of the possible luminance values in that context, the brightness of the target will be greater than the brightness elicited by the same luminance in contexts in which that luminance has a lower value in the probability distribution function.”
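To make the quoted proposal concrete, here is a minimal Python sketch of what “the value of the target luminance in the probability distribution function” could amount to, read as a percentile rank within the luminances that co-occur with a given context. This reading, the function name, the less-than-or-equal convention and the sample values are all assumptions made for illustration; none of them is taken from Yang and Purves (2004).

```python
from bisect import bisect_right


def luminance_percentile(target, co_occurring):
    """Percentile rank (0-100) of `target` among luminances seen in one context.

    Uses the empirical cumulative distribution: the share of observed
    values that are less than or equal to the target (an assumed convention).
    """
    if not co_occurring:
        raise ValueError("need at least one sample for the context")
    ordered = sorted(co_occurring)
    return 100.0 * bisect_right(ordered, target) / len(ordered)


# Hypothetical luminance samples (cd/m^2) co-occurring with two contexts.
dark_context = [10, 20, 30, 40, 50, 60, 70, 80]
light_context = [40, 60, 80, 100, 120, 140, 160, 180]

target = 60
p_dark = luminance_percentile(target, dark_context)
p_light = luminance_percentile(target, light_context)

# On the quoted proposal, the same luminance would look brighter in the
# context where it ranks higher.
print(p_dark, p_light)
```

With these made-up samples the same target luminance ranks higher in the dark context than in the light one, which is the pattern the proposal would need natural-scene frequencies to produce.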
This frequency-percept correlation, in combination with the claim that it has come about on the basis of evolution by natural selection, is the working hypothesis. It is a pure guess; it does not follow naturally from anything that has come before, nor, for that matter, from the principles of natural selection (see below). It is, frankly, bizarre. If it could be corroborated, it would be a result in need of an explanation. As it is, it has not been corroborated, or even tested, nor is it amenable to testing (see below).
THE HYPOTHESIS The hypothesis offered is said to “explain” a narrow set of lightness demonstrations. Each consists of two displays, both containing a surface of luminance x; despite being equiluminant, these surfaces differ in their appearance, one appearing lighter than the other.
The claim is as follows: The visual system has evolved to represent as lighter those surfaces that appear more frequently in one “context” than in another. Thus, in each display, the surface that appears lighter must have been encountered more often in that particular “context” during the course of evolution. It is because seeing the higher-frequency target/context combination as lighter was “optimal” that this situation arose. More specifically, it is because the response “lighter” to a target in one “context” and “darker” to a target in another context, and so on for contexts in-between, has had, over evolutionary time, positive adaptive effects, that these percepts have become instantiated in the visual process.
DO ORGANISMS TRACK ABSOLUTE AND RELATIVE LIGHT INTENSITY OVER MOMENTS, HOURS, DAYS, EONS, AT ALL POINTS IN THE RETINAL IMAGE, AND WHEN DID THEY STOP DOING THIS? The hypothesis would appear to presuppose that organisms can discriminate between, and keep track of, the absolute luminances of various “targets” in various “contexts,” as they have been encountered with every glance, of every individual (or at least the ancestors of every individual), at every point in the visual image, every moment, hour, day, across the evolutionary trajectory of the species. There is no suggestion as to how these absolute luminances might be tracked, even “in effect.” Not only is it difficult to imagine what mechanism (other than a miracle) might allow such (species-wide) data collection (or its equivalent) to arise, it runs contrary to the physiology of the visual system, in which inhibitory mechanisms ensure that relative, not absolute, luminance information is coded even at the lowest levels of the nervous system. Furthermore, given that lightness perception does not change across an individual's lifetime, nor depend on their particular visual experience, it would seem that this process of statistical accumulation must have stopped at some point during our evolution. So we have to ask, when do the authors suppose this (impossible) process to have stopped, why did it stop, and what were the relevant environments at that time and previously? (The logical assumption that the authors suppose the process has, in fact, stopped is reinforced when they refer to “instantiated” rather than developing “statistical structures.”)
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.
-
- Feb 2018
-
europepmc.org
-
On 2015 Dec 28, Lydia Maniatis commented:
Part 1. I have criticized Purves and colleagues' various iterations of the “wholly empirical” account of visual perception in various comments on PubMed Commons and on PubPeer. Here, I want to present a more comprehensive and organized critique, focussing on a study that has served as a centerpiece of these accounts. I think it is important to clarify the fact that the story is profoundly inadequate at both the conceptual and the methodological levels, and fails to meet fundamental criteria of empirical science (logical consistency and testability). Such criticism (which is not welcome in the published literature) is important, because if “anything goes” in published research at even the most elite journals, then this literature cannot help to construct the conceptual infrastructure necessary for progress in the visual sciences.
Conceptual problems LACK OF THEORETICAL MOTIVATION The hypothesis that the authors claim to have tested is effectively conjured out of thin air, subsequent to an exposition that is inaccurate, inadequate, and incoherent.
The phenomenon to be explained is inadequately described. We are told, first, that the luminance of a “visual target” elicits a “brightness” percept that can vary according to “context.”
The term brightness, here, is being used to refer to the impression of white-gray-black that a surface may elicit (and which is currently referred to by researchers as its “lightness.”) However, surfaces do not necessarily, or even usually, elicit a unitary percept; they often produce the impression of double layers, e.g. a shadow overlying a solid surface. Both the shadow and the surface have a perceptual valence, a “brightness.” Indeed, it is under conditions where one “target” appears to lie within the confines of a double layer, and the other does not, that the most extreme apparent differences between equiluminant targets arise.
The logic is simple: If a surface appears to lie under a shadow, then it will appear lighter than an equiluminant target that appears to lie outside of the shadow, because a target that emits the same amount of light under a lower illumination as one under stronger illumination must have a higher light-reflecting tendency (and this is what the visual system labels using the white-gray-black code.). The phenomenon and logic of double layers is not even intimated at by these Yang and Purves (2004); rather, the impression created is that the relationship between “brightnesses” under different “contexts” is not in the least understood or even amenable to rational analysis. (In addition, the achromatic conditions being considered are rare or even non-existent in natural, daytime conditions; but the more complex phenomenon of chromatic contrast is not mentioned.)
After this casual attempt to convince readers that vision science is completely in the dark on the subject of “brightness,” the authors assert that “A growing body of evidence has shown that the visual system uses the statistics of stimulus features in natural environments to generate visual percepts of the physical world.” No empirical studies are cited or described to clarify or support this rather opaque assertion. The only citation is of a book containing, not evidence, but “models” of brain function. The authors go on to state that “if so, the visual system must incorporate these statistics as a central feature of processing relevant to brightness and other visual qualia.” (It should be noted that the terms “statistics” and “features” here contain no concrete information. The authors could be counting the stones on Brighton beach.)
Even if we gave the authors the latitude to define “statistics” and “features” any way they wish, the following sentence would still come out of the blue: “Accordingly, we propose that the perceived brightness elicited by the luminance of a target in any given context is based on the value of the target luminance in the probability distribution function of the possible values that co-occur with that contextual luminance experienced during evolution. In particular, whenever the target luminance in a given context corresponds to a higher value in the probability distribution function of the possible luminance values in that context, the brightness of the target will be greater than the brightness elicited by the same luminance in contexts in which that luminance has a lower value in the probability distribution function.”
This frequency-percept correlation, in combination with the claim that it has come about on the basis of evolution by natural selection, is the working hypothesis. It is a pure guess; it does not follow naturally from anything that has come before, nor, for that matter, from the principles of natural selection (see below). It is, frankly, bizarre. If it could be corroborated, it would be a result in need of an explanation. As it is, it has not been corroborated, or even tested, nor is it amenable to testing (see below).
THE HYPOTHESIS The hypothesis offered is said to “explain” a narrow set of lightness demonstrations. Each consists of two displays, both containing a surface of luminance x; despite being equiluminant, these surfaces differ in their appearance, one appearing lighter than the other.
The claim is as follows: The visual system has evolved to represent as lighter those surfaces that appear more frequently in one “context” than in another. Thus, in each display, the surface that appears lighter must have been encountered more often in that particular “context” during the course of evolution. It is because seeing the higher-frequency target/context combination as lighter was “optimal” this situation arose. More specifically, it is because the response “lighter” to a target in one “context” and “darker” to a target in another context, and so on for contexts in-between, has had, over evolutionary time, positive adaptive effects, that these percepts have become instantiated in the visual process.
DO ORGANISMS TRACK ABSOLUTE AND RELATIVE LIGHT INTENSITY OVER MOMENTS, HOURS, DAYS, EONS, AT ALL POINTS IN THE RETINAL IMAGE, AND WHEN DID THEY STOP DOING THIS? The hypothesis would appear to presuppose that organisms can discriminate between, and keep track of, the absolute luminances of various “targets” in various “contexts,” as they have been encountered with every glance, of every individual (or at least the ancestors of every individual), at every point in the visual image, every moment, hour, day, across the evolutionary trajectory of the species. There is no suggestion as to how these absolute luminances might be tracked, even “in effect.” Not only is it difficult to imagine what mechanism (other than a miracle) might allow such (species-wide) data collection (or its equivalent) to arise, it runs contrary to the physiology of the visual system, in which inhibitory mechanisms ensure that relative, not absolute, luminance information is coded even at the lowest levels of the nervous system. Furthermore, given that lightness perception does not change across an individual's lifetime, nor depend on their particular visual experience, it would seem that this process of statistical accumulation must have stopped at some point during our evolution. So we have to ask, when do the authors suppose this (impossible) process to have stopped, why did it stop, and what were the relevant environments at that time and previously? (The logical assumption that the authors suppose the process has, in fact, stopped is reinforced when they refer to “instantiated” rather than developing “statistical structures.”)
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY. -
On 2015 Dec 28, Lydia Maniatis commented:
Part 2. THE HYPOTHESIS IS LAMARCKIAN The hypothesis does not appear consistent with the principles of evolution on the basis of natural selection. Unless I misunderstand, it seems to require not only that stimulus frequency information be accumulated throughout the lifetime of each organism, but also that this knowledge be passed on genetically, which is a form of Lamarckism. Natural selection doesn't work that way – mutations are random, lucky shots in the dark, not the result of accumulated, recorded luminance or responses to luminance – experience only comes into play after the fact, in the sense that if the mutations happen to be adaptive, they undergo positive selection.
In addition, the hypothesis would seem to predict that populations living in highly different environments (one living, for example, in the arctic, the other in the dark jungle) would have different lightness codes. But even fish, for example, appear to possess color perception mechanisms similar to those of humans.
There is also a more subtle problem with the proposal: it purports to explain the production of particular qualia (light, dark) but doesn't actually touch on this problem at all. The luminance/lightness connection proposed is physically arbitrary, an accident of experience. Why should there be adaptive pressure to code physical conditions on the basis of the chance frequency with which they are encountered, rather than on the basis of features and factors directly relevant to survival and reproduction (a rare event might have more powerful effects)?
THE HYPOTHESIS IS NEITHER NECESSARY NOR SUFFICIENT AND LEAVES MANY QUESTIONS UNANSWERED The hypothesis proposes to explain why two surfaces of equal luminance, in a very limited number and type of displays, produce slightly different percepts; it does not try to explain the relationship between luminance and lightness, which is quite regular in many contexts: specifically, that the latter rises or falls with the former under certain describable conditions (when the structure of the pattern indicates an area under homogeneous illumination). In other words, although it is the case that the two equal target luminances in the simultaneous contrast demo differ in appearance, it is also the case that continuously increasing the luminance of either one will cause a continuous increase in its lightness. Thus, the hypothesis implies that, for any given context, the frequency of the target/context combination increases with increasing luminance of the target.
The authors seem to realize this when they state that: “by definition, the percentile of target luminance for the lowest luminance value within any contextual light pattern is 0% and corresponds to the perception of maximum darkness, the percentile for the highest luminance ...is 100% [?] and corresponds to the maximum perceivable brightness.” But they seem to be confusing prediction, definition and fact. We know as a matter of FACT – from empirical experience, not definition, not hypothesis - that increasing luminance of a patch in any given setting tends to increase its lightness. But this is not a logical implication of the frequency hypothesis (and certainly not an a priori “definition.”) Even assuming it is possible, methodologically, to show that the “percentiles” coincide with this straightforward luminance/lightness relationship, the “frequency hypothesis,” lacking any discernible rationale, would not seem to possess a logical advantage over the hypothesis that there is simply an adaptive value to a regular coding of lightness and relative luminance, i.e. to coding higher luminance with higher lightness values, with corrections for the sake of accurately separating perception of surfaces and perception of apparent illumination. The authors seem to imply that their proposal is superior to such accounts because it does not require a direct relationship between luminance and lightness – but no serious alternative could require any such thing.
Given the fairly straightforward within-context luminance/lightness relationship, the hypothesis reduces to the idea that a change in the luminance of the background of a target can change the range of the target lightness values arising in perception. Why should this be? If the entire range for a “context” is shifted upwards, for example, should we assume that that context has a higher percentile than the lower-range-producing context, and that, again, the correspondingly regular luminance/lightness relationship within that range is a coincidence of frequency? And, finally, why does the frequency argument not apply to the apparent lightness of the background? Or does it? In this case, what happens if a low-frequency target/context combination occurs on a high-frequency context? Would the frequency of a single-luminance context equal the frequency of that luminance? Do some luminances occur more frequently than others? Or would we have to evaluate the frequency of each “context” given its own various possible contexts? And what about the fact that more global changes are known to be able to affect local lightness (rendering any particular cut-off of “context” arbitrary and uninformative)?
WHAT DOES IT MEAN THAT THE PROJECTION HAS “HIGHLY-STRUCTURED STATISTICS?” The authors make a mystifying claim about the nature of the retinal projection. We are told that it consists of 2D patterns of light intensity with “highly structured statistics.” What does this mean? How can one collect and evaluate the “structured statistics” in the pattern?
One thing is certain – the pattern of light intensities in the projection is wholly unpredictable, as is the pattern of light intensities in the environment. The light reflected to the eye depends on both the characteristics of surfaces and on the light falling on them. Both change from location to location, but the latter is also unstable within locales, shadows depending on chance locations and orientations of objects, on their shapes and relative locations, the location of the sun, the presence of clouds, etc. Given that the order of the shapes, and the shapes appearing with any given glance, is also unpredictable, it is difficult to see what the authors are talking about when they refer to “highly structured statistics.” They need to explain what they mean.
Methodological problems LACK OF TESTABILITY Even if we ignore the logical, practical and theoretical problems with the hypothesis, it seems impossible to test. Even if our sole interest were the classic simultaneous contrast illusion, and even if such configurations appeared as such in nature (which they do not), how could we develop a database of the frequencies with which any target luminance appeared on any background luminance (let alone all of them) on the retinas of all of the individuals forming the lineage of our species (until the process is supposed to have stopped; see above), across all of the days and environments traversed, morning, noon and night, in the sea and on the land? The hypothesis does not entail any principled relationship between luminance and lightness, unless this relationship were in turn entailed by the luminance/frequency relationship for all possible contexts; thus, it is not clear what the criteria might be for obtaining a valid sample for testing. As it is, the notion has not been properly tested even for the narrow framing of this article, because the methods for choosing samples (the details of which are relegated to supplementary, online material) are too vague, arbitrary and opaque at key points to allow any attempts at replication.
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY. -
On 2015 Dec 28, Lydia Maniatis commented:
Part 3. ARBITRARY, PRE-EXISTING DATABASE Samples were obtained from a “database of natural scenes.” This database was compiled by Van Hateren and Van der Schaaf (1998) for a very different purpose, without any of the concerns (e.g. that it be representative of the patterns falling on the retinas of humans and their ancestors over evolutionary time) motivating the present study. Neither the original authors nor the authors of the present article provide any rationale for the choice of scenes (see http://www.kyb.tuebingen.mpg.de/?id=227). As readers, we are left with the information that the scenes were “natural,” which, for practical purposes, is no information at all. Given the variety of “natural environments” encountered within and across lifetimes, it doesn't seem credible that the frequencies being predicted would happen to arise in this arbitrarily chosen database. At the least, the authors should provide a theoretically-motivated description of the nature of these “natural images.”
VAGUE, UNMOTIVATED, OPAQUE SAMPLING CRITERIA The tested configurations were “superimposed on images to find light patterns in which the luminance values of both the surround and target regions were approximately homogeneous.” For more detail, we are directed to the supplementary material. Clearly, “approximately homogeneous,” like “natural,” is not nearly specific enough for us to either replicate or evaluate the methodological choices being made. What particular level of “approximate homogeneity,” for example, should count as a match, and why? In perception, we know that a small local change may cause a global change in a characteristic of a scene, even though the scene may still be described as “approximately” the same.
The situation is not improved by the supplementary material. With respect to the database, we are told that while it “has several limitations (limited locales, a limited luminance range, and fewer representations of very high and low luminance values than actually occur in nature), it provides a reasonable proxy of normal visual experience.” No criteria are provided or discussed as to what should constitute a “reasonable proxy” of experience. If the authors have criteria, they are keeping them to themselves. In other words, they offer no principle for selecting a database – anyone wanting to replicate this experiment would presumably need to use this particular database, and to take it on faith that it is “reasonable.”
We are told that the configurations subtended 5 degrees of visual angle, and that this was a “dimension within the range of the relevant demonstrations and psychophysical studies.” As a rationale for their methodological choice, this statement contains essentially no information. What is the acceptable range, why, how is the specific “dimension” used motivated by their theoretical rationale, and on what basis are other “dimensions” excluded? Would their inclusion affect the frequency data?
The criteria for choosing the samples were as follows: “(i) the luminances of the relevant regions had to be approximately homogeneous (standard deviation of luminance less than a calculated fraction of the mean luminance for 90% of all samples); and (ii) the luminances of two regions labeled Lu had to be similar, as did the two regions labeled Lv (absolute difference of the mean luminance of the regions ≤ 250 cd/m², or ≈ 0.5% of the luminance range in the database).”
What do the authors mean by “a calculated fraction of the mean luminance?” What was the theoretical rationale for this and the other choices as to what to accept as a valid sample?
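To illustrate why the undisclosed “calculated fraction” matters, here is a minimal sketch of the two quoted criteria with that fraction exposed as an explicit parameter. The function names, the dictionary keys and the luminance values are all hypothetical, and the sketch does not attempt to model the ambiguous “for 90% of all samples” clause; it is not the authors' procedure, only one plausible reading of it.

```python
from statistics import mean, pstdev

def is_homogeneous(region, fraction):
    """Criterion (i): standard deviation below `fraction` of the mean luminance."""
    return pstdev(region) < fraction * mean(region)

def regions_match(region_a, region_b, max_diff=250.0):
    """Criterion (ii): mean luminances within max_diff (cd/m^2) of each other."""
    return abs(mean(region_a) - mean(region_b)) <= max_diff

def accept(sample, fraction):
    """Apply both criteria to one candidate sample (hypothetical layout)."""
    regions = (sample["target"], sample["lu_a"], sample["lu_b"])
    return (all(is_homogeneous(r, fraction) for r in regions)
            and regions_match(sample["lu_a"], sample["lu_b"]))

# Hypothetical candidate samples: pixel luminances (cd/m^2) per region.
candidates = [
    {"target": [100, 102, 98], "lu_a": [500, 510, 490], "lu_b": [520, 515, 525]},
    {"target": [100, 130, 70], "lu_a": [500, 600, 400], "lu_b": [520, 515, 525]},
    {"target": [100, 101, 99], "lu_a": [500, 505, 495], "lu_b": [900, 905, 895]},
]

def count_accepted(fraction):
    return sum(accept(s, fraction) for s in candidates)

for fraction in (0.01, 0.05, 0.3):
    print(fraction, count_accepted(fraction))  # 0, then 1, then 2 samples accepted
```

Even in this toy case the set of “valid” samples, and hence any frequency estimate built on it, changes with the value chosen for the fraction, which is why the value and its rationale need to be reported.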
In the aftermath of these arbitrary and vague methodological descriptions – which combine specificity with opaqueness – we are told that the templates used as such are actually very rare in nature. Thus, “Overly stringent criteria will exclude most of the samples or segregate luminance patterns, each set having very few samples. On the other hand, overly loose criteria will collapse many different luminance patterns, obscuring the variations of interest. For the sampling configurations used, in which the luminance values of different regions were homogeneous, rigid criteria are not necessary.” Why not? We are told that “the stimulus patterns in Fig. 1 generate similar perceptual effects even when they are quite noisy.” Really? This claim is made without evidence (on the authors' say-so) and in terms (“similar perceptual effects;” “quite noisy”) that fail to meet the standards of empirical science. In short, we have to take it on faith that the criteria chosen by the authors – e.g. their undisclosed “calculated fraction” – are “just right.” Still, in an excess of rigor, they “examined standard deviations of luminance less than a calculated fraction of the mean luminance for 75%, 50%, or even 10% of all samples. As long as enough samples were obtained from the database, we found quantitatively or qualitatively similar results.” What these results were, the value of “enough” and the distinction being made here between “quantitatively similar” and “qualitatively similar” are not revealed. “Regional similarity” is treated in a similar way. The criteria are judged just right because a control (is supposed to have) produced results “quantitatively or qualitatively similar in these various conditions, as long as enough samples were obtained.” It seems fair to wonder why, if alternative criteria produced “similar” (statistically significant?) results, this broader set of criteria was not employed in the main experiment.
PERCEPTUAL EFFECTS OF SAMPLES NOT TESTED We are told that samples were chosen such that they respected the local geometry of the templates, and the luminance levels, and that these sampling configurations “generated the same brightness illusions” as the templates. This is meant to be taken on faith, not data, as no observers, other than the authors, were consulted on this matter. This is unacceptable, as the validity of samples requires that they be perceptually equivalent to the templates. A small set of photos provided of some of the samples in each case shows that they are highly heterogeneous, and not obviously “similar.” It is well-known that local changes either within a figure or outside the area of interest can produce large changes in lightness. For example, a target of luminance x lying on a surround of luminance y may appear very different in lightness from another target of the same luminance x lying on a surround of the same luminance y. This can happen if one of the two entire target/surround combinations lies within an area apparently in shadow, and the other apparently in plain view. Surely some of the “equivalent” samples in the current experiment were lying under differing apparent illumination, in which case the assumption that targets were perceptually equivalent, based on the frequency of the template match, would not be valid. In short, it cannot be taken as a given (on the two authors' say-so) that the samples counted are perceptually equivalent to the reference stimuli. Without this information, the predictions cannot be said to have been corroborated, even for this narrow set of cases. (In the case of White's illusion, the authors make inaccurate claims even with respect to the illusion itself, claiming that the “component” of the illusion used as a template “elicited much the same effect as the usual presentation.” This is not true. The complete figure elicits a stronger effect, and one that includes a transparency effect on the lighter side.)
OTHER DATA-FREE ASSERTIONS Our faith in the authors' judgment is recruited also with respect to the issue of “scale invariance.” We're assured that “Four scales, including the scale of the original images, were tested and found to produce approximately the same conditional probability distribution functions.” No supporting data are provided.
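For comparison, a claim that several scales yield “approximately the same” distribution functions is the kind of statement that can be backed by a reported number. One conventional choice, offered here only as an illustration and not as anything the authors did, is the largest gap between two empirical cumulative distributions (the two-sample Kolmogorov-Smirnov statistic). The sample values and names below are hypothetical.

```python
from bisect import bisect_right

def ecdf(samples):
    """Return the empirical cumulative distribution function of `samples`."""
    ordered = sorted(samples)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n

def max_cdf_gap(a, b):
    """Largest absolute difference between the two empirical CDFs
    (the two-sample Kolmogorov-Smirnov statistic)."""
    fa, fb = ecdf(a), ecdf(b)
    # The gap can only change at observed values, so checking those suffices.
    return max(abs(fa(x) - fb(x)) for x in set(a) | set(b))

# Hypothetical target luminances sampled at three image scales.
scale_1 = [10, 20, 30, 40, 50]
scale_2 = [12, 22, 28, 41, 55]
scale_3 = [40, 50, 60, 70, 80]

print(round(max_cdf_gap(scale_1, scale_2), 2))  # small gap: similar distributions
print(round(max_cdf_gap(scale_1, scale_3), 2))  # large gap: dissimilar distributions
```

Reporting such a statistic for each pair of scales, together with the sample sizes, would let readers judge for themselves what “approximately the same” means.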
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY. -
On 2015 Dec 28, Lydia Maniatis commented:
Part 4. Finally, and always in keeping with the casual nature of this article, the authors are content to explain a salient feature of their data – the plateaus at the high and low ends of the distributions - as “presumably reflect[ing] the limited number of samples at these extremes.” If the shape of the distributions has theoretical significance, this offhand presumption is clearly not good enough, though apparently it was good enough for PNAS.
It should be clear that this article (and articles relying on it for their claims) sorely lacks rigor, both in the framing of its empirical questions and in the methods used to test them, and asks the reader to take far too much on an empirically unhealthy combination of ignorance and faith.
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY. -