- Jul 2018
-
europepmc.org
-
On 2017 Mar 06, Lydia Maniatis commented:
The authors’ theoretical position does not seem coherent. They are making an unintelligible distinction between what they call “low-level” stimulus features – among which they list “brightness, contrast, color or spatial frequency” – on the one hand, and “high-level information such as depth cues” on the other. The latter include “texture and shading.” But in an image, the latter are simply descriptions of perceptual effects of variations in luminance, etc. For example, in a black and white photo, what we might refer to as “shading” consists, objectively, of changes in the luminance of the surface and of the reaction of our visual system to these variations. Similarly for texture. So when they say that “The perception of depth can have the effect of over-riding some of the salient 2-D cues,” one wonders whether they mean to suggest that the “perception of depth” is based on some kind of clairvoyance. And when they say that “The results lend support to a depth cue invariant mechanism for object inspection and gaze planning” they’re basically just saying “how we look at something depends on what it looks like.” And what it looks like depends on…
With respect to the division of perceptual features into “high-level” and “low-level,” this is also a theoretical non-starter, which I’ve discussed in various comments, including a recent one on Schmid and Anderson (2017), copied below.
The methods are for the most part pre-packaged, from various sources. Their theoretical underpinnings are questionable. The figure 2 example of a face based on texture just doesn’t look like a face at all. We’re told that it was generated using the method described by Liu et al. (2005). I guess that will have to do… The use of forced choices is indefensible, resulting in the loss of information and the need to invent untestable “guess rates” and “lapse rates”: “The guessing and the lapse rates were fixed to 0.25 and 0.001, respectively.” The stimuli were very ambiguous, rendering the recognition task difficult, which necessitated certain post hoc measures to clean up the data. Basically we end up comparing a couple of arbitrary manipulations without any interpretable theoretical significance.
From comment on Schmid and Anderson (2017) https://pubpeer.com/publications/8BCF47A7F782E357ECF987E5DBFC55#fb117951
“The present findings demonstrate that it is difficult to tease apart low-level (e.g., contrast) and midlevel (e.g., transparency) contributions to lightness phenomena in simple displays… Dissociating midlevel transparency explanations from low-level contrast explanations of the crispening effect will always be problematic, as by definition information is processed by “low-level” mechanisms before higher visual processing areas responsible for the midlevel segmentation of surfaces.”
As the above passage indicates, the authors of this article are endorsing the (untenable but common) notion that, within a visual percept, some features are reflections of “low-level” processes, i.e. activities of neurons at anatomical levels nearer to the retinal starting point, while other features are reflections of the activities of “mid-level” neurons, later in the anatomical pathway. Still others, presumably, are reflections of the activities of “high-level” neurons. Thus, when we observe that a grey square on a dark grey background appears lighter than the same grey square on a light grey background, this is the result of “low-level” firing patterns, while if we perceive a grey film overlying both squares and backgrounds (an effect we can achieve by simply making certain alterations in the wider configuration, leaving the "target" area untouched), this is a consequence of “mid-level” firing activity. And so on. Relatedly, the story goes, we can effectively observe and analyze neural processes at selected levels by examining selected elements of the percepts to which various stimuli give rise.
These assumptions are not based on any evidence or rational arguments; the arguments, in fact, are all against.
That such a view constitutes a gross and unwarranted oversimplification of an unimaginably complex system whose mechanics, and the relationships between those mechanics and perception, we are not even close to understanding, should be self-evident.
Even if this were not the case, the view is paradoxical. It’s paradoxical for many reasons, but I’ll focus on one here. We know that at any given location in the visual percept – any patch – what is perceived – with respect to any and all features – is contingent on the entire area of stimulation. That is, with respect to the percept, we are not dealing with a process of “and-sum.” This has been demonstrated ad infinitum.
But the invocation of “low-level” processes is simultaneously an invocation of “local” processes. So to say that the color of area “x” in this visual percept is the product of local process “y” is tantamount to saying that, for some reason, the normal, organized feedback/feedforward response to the retinal stimulation stopped short at this low level. But when and how does the system decide to stop processing at the lower level? Wouldn't some process higher up, with a global perspective, need to OK this shutting down of the more global process (to be sure, for example, that a more extended view doesn’t imply transparency)? And if so, would we still be justified in attributing the feature to a low-level process?
In addition, the “mid-level segmentation of surfaces” has strong effects on perceived lightness; are these supposed to be added to the “low-level contrast effects” (with the "low-level" info simultaneously underpinning the "mid-level" activity)? A rationale is desperately needed.
Arbitrarily interpreting the visual percept in terms of piecemeal processes for one feature, semi-global processes for another, and entirely global processes for a third, with some or all operating at the same time, is not a coherent position.
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.
-