- Jul 2018
-
europepmc.org
-
On 2016 Nov 24, Lydia Maniatis commented:
As should be evident from the corresponding PubPeer discussion, I have to disagree with all of Guy’s claims. I think logic and evidence are on my side. Not only is there not “considerable evidence that humans acquire knowledge of how depth cues work from experience,” the evidence and logic are all on the opposite side. The naïve use of the term “object” and the reference to how objects change “as we approach or touch them and learn about how they change in size, aerial perspective, linear perspective etc” indicate a failure to understand the fundamental problem of perception, i.e. how the proximal stimulus, which does not consist of objects of any size or shape, is metamorphosed into a shaped 3D percept. Perceiving 3D shape presupposes depth perception.
As Gilchrist (2003) points out in a critical Nature review of Purves and Lotto’s book, “Why we see what we do”: “Infant habituation studies show that size and shape are perceived correctly on the first day of life. The baby regards a small nearby object and a distant larger object as different even when they make the same retinal image. But newborns can recognize an object placed at two different distances as the same object, despite the different retinal size, or the same rectangle placed at different slants. How can the newborn learn something so sophisticated in a matter of hours?”
Gilchrist also addresses the logical problems of the “learning” perspective (caps mine): “In the 18th C, George Berkeley argued that touch educates vision. However, this merely displaces the problem. Tactile stimulation is even more ambiguous than retinal stimulation, and the weight of the evidence shows that vision educates touch, not vice versa. Purves and Lotto speak of what the ambiguous stimulus “turned out to signify in past experience.” But exactly how did it turn out thus? WHAT IS THE SOURCE OF FEEDBACK THAT RESOLVES THE AMBIGUITY?”
“Learning” proponents consistently fail to acknowledge, let alone attempt to answer, this last question. As I point out on PubPeer, if touch helps us to learn to see, then the wide use of touchscreens by children should presumably compromise 3D perception, since the tactile feedback is indicative of flatness at all times.
The confusion is evident in Guy’s reference to the “trusted cue – occlusion implying depth.” Again, there is a naïve use of the term “occlusion.” Obviously, the image observers see on the screen isn’t occluded; it’s just a pattern of colored points. With respect to both the screen and the retinal stimulation, there is no occlusion because there are no objects. Occlusion is a perceptual, not a physical, fact as far as the proximal stimulus is concerned. So the cue itself is an inferred construct intimately linked to object perception. So we’re forced to ask what cued the cue…and so on, ad infinitum. Ultimately, we’re forced to go back to brass tacks, to tackle figure-ground organization via general principles of organization. Even if we accepted that there could (somehow) be unambiguous cues, we would still have the problem that each retinal image is unique, so we would need a different cue - and thus an infinite number of cues - to handle all of the ambiguity. This makes the use of “cues” redundant.
So the notion that “one might not need much to allow a self-organising system of cues to rapidly ‘boot-strap’ itself into a robust system in which myriad sensory cues are integrated optimally” is clearly untenable if we try to actually work through what it implies. The concept of ‘cue recruitment’ throws up a lot of concerns only because even its provisional acceptance requires that we accept unacceptable assumptions.
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY. -
On 2016 Nov 21, Guy M Wallis commented:
Lydia raises an important question. Surely we can't learn everything! We need something to hang our perceptual hat on to get the ball rolling. After all, in the experiments described in our paper, Ben and I relied on the presence of a trusted cue - occlusion implying depth - to allow the observer to harness the new cue which we imposed - arm movement. But where did knowledge of the trusted depth cue come from? Did we have to learn that too? Well, there is considerable empirical evidence that humans do acquire knowledge of how depth cues work from experience. We observe objects as we approach or touch them and learn about how they change in size, aerial perspective, linear perspective etc. But it also seems likely that some cues have been acquired in phylogenetic time due to their reliability and utility. The apparently in-built assumption that lighting in a scene comes from above and the left may be an example of this. In the end though, one might not need much to allow a self-organising system of cues to rapidly 'boot-strap' itself into a robust system in which myriad sensory cues are integrated optimally.
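For concreteness, "integrated optimally" in the cue-combination literature is standardly taken to mean reliability-weighted averaging, with each cue weighted in inverse proportion to its variance. Here is a minimal sketch in Python; the noise values are invented purely for illustration and are not taken from the paper:

```python
import numpy as np

def combine_cues(estimates, sigmas):
    """Reliability-weighted ('optimal') cue combination.

    Each cue's estimate is weighted by its inverse variance; the
    combined estimate has lower variance than any single cue.
    """
    estimates = np.asarray(estimates, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    weights = 1.0 / sigmas ** 2
    weights /= weights.sum()
    combined = np.dot(weights, estimates)
    combined_sigma = np.sqrt(1.0 / (1.0 / sigmas ** 2).sum())
    return combined, combined_sigma

# Hypothetical example: an occlusion cue and an arm-movement cue both
# provide an estimate of rotation direction (in arbitrary evidence units).
estimate, sigma = combine_cues(estimates=[+1.0, +0.4], sigmas=[0.5, 1.0])
print(estimate, sigma)  # pulled toward the more reliable (occlusion) cue
```

The combined estimate is drawn toward the more reliable cue and its uncertainty is lower than that of either cue alone, which is all that "optimal integration" means in this context.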
Lydia and my co-author, Benjamin Backus, have been engaged in a lively and informative exchange on PubPeer which I recommend to those interested in this debate. The concept of cue recruitment throws up a lot of concerns and queries.
https://pubpeer.com/publications/2622B45C885243AFCB5C604CB0638B
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY. -
On 2016 Nov 12, Lydia Maniatis commented:
It occurs to me that the "cue recruitment theory" is susceptible to the problem of infinite regress. If percepts are by their nature ambiguous, and require "cues" to disambiguate, then aren't the cues, which are also perceptual articles, also in need of disambiguation? Don't we need to cue the cue? And so on....
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY. -
On 2016 Nov 12, Lydia Maniatis commented:
Two (probably) final points regarding the authors' conclusion quoted below:
"In conclusion, the present study presents evidence that a voluntary action (arm movement) can influence visual perceptual processes. We suggest that this relationship may develop through an already functional link between motor behavior and the visual system (Cisek & Kalaska, 2010; Fagioli et al., 2007; Wohlschläger & Wohlschläger, 1998). Through the associative learning paradigm used here, this relationship can be modified to enable arbitrary relationships between limb movement and perceived motion of a perceptually ambiguous stimulus. "
First, most stimuli are not perceptually ambiguous (i.e. they are not bistable or multistable), so the relevance of this putative finding is questionable in practice, and would require much more development in theory.
Second, the claim that it is possible to construct "arbitrary relationships between limb movement and perceived motion of a perceptually ambiguous stimulus" is a radical behaviorist claim, of a type that has consistently been falsified both logically and empirically.
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY. -
On 2016 Nov 12, Lydia Maniatis commented:
The degree of uncertainty incorporated into this study in the form of confounds means that the claims at the front end carry no weight.
Essentially, the authors appear to be employing a forced-choice paradigm. (They don’t refer to it as such, but rather as a “dichotomous perceptual decision.”) Their stimulus is bistable, unstable, briefly presented, and temporally decaying, and the response relies on memory, since it occurs after the image has left the screen. Their training procedure likely produces expectations that may bias outcomes.
The highly unstable nature of the Necker cube, even in static form, is self-evident. I don’t know if this is mitigated by motion, but I doubt it. I would expect the uncertainty to be even greater when the square face of the figure isn’t in a vertical/horizontal orientation.
In their discussion, the authors address the possibility of response bias in their study: “Firestone and Scholl (in press)…include a section on action-based influences on perception. The authors argue that much of this literature is polluted with response bias and that suitable control studies have undermined many of the earlier findings.”
Wallis and Backus counter this possibility with a straw man. “If participants were trying to respond in a manner they thought we might expect, there is no reason why they would not have done so in the passive conditions…”
However, the question isn’t only whether participants were trying to meet investigator expectations, but whether they had developed expectations of their own based on the “training” procedures.
In the so-called passive training condition, an arrow, either congruent or incongruent, was associated with the rotation of a disambiguated Necker cube. However, in this condition observers have no incentive to pay attention to this peripheral form and its connection with the area of interest. In the active condition, in contrast, it is necessary to attend to the arrows and to act on them. This obligation to act on the arrows while observing the figure ensures that attention is paid to their connection with cube rotation.
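To make the attentional confound concrete, consider a toy delta-rule sketch (Python; the parameters are entirely hypothetical and this is not the authors' model) in which the arrow-rotation association is learned only to the extent that the arrow is attended:

```python
def learn_association(n_trials, attention, learning_rate=0.1):
    """Toy Rescorla-Wagner-style update, gated by attention to the cue.

    w is the learned strength of the arrow -> rotation association.
    On each trial the arrow (cue present) is paired with the disambiguated
    rotation (outcome = 1); attention scales the effective update.
    """
    w = 0.0
    for _ in range(n_trials):
        prediction_error = 1.0 - w          # outcome minus current prediction
        w += learning_rate * attention * prediction_error
    return w

# Hypothetical contrast: attended (active) vs. largely ignored (passive) arrows.
print(learn_association(n_trials=60, attention=1.0))   # approaches 1: association formed
print(learn_association(n_trials=60, attention=0.05))  # stays small: little learning
```

On an account of this kind, the active/passive difference could reflect nothing more than where attention was directed during training.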
The conceptual and methodological uncertainty is compounded by the fact that the authors themselves can’t explain (though they presumably expected it) the failure of the arrows alone to produce a perceptual bias. As with the previous issue, they dispense too casually with the problem:
“So why did the participants in the passive conditions show little or no cue recruitment? As mentioned in the Introduction, Orhan et al. (2010) have argued that there must be a mechanism for determining which cues can be combined to create a meaningful interpretation of the sensory array. In the context of this study it would appear that passive viewing of the rotating object and the contingent arrows, does not satisfy this mechanism's requirements. This is perhaps because the arrows are regarded as extrinsic to the stimulus and hence unfavored for recruitment (Jain et al., 2014).”
This is as weak and evasive an argument as could possibly be made in a scientific paper. The authors ask why the arrow “cue” itself didn’t have an effect. They answer that it didn’t have an effect because it doesn’t satisfy the unknown requirements of an unknown mechanism that is nevertheless presumed to exist. So if a putative cue “works,” it proves the mechanism exists, and if a putative cue doesn’t work, it shows the mechanism is uninterested in it. Thus the cue theory is a classic case of an unfalsifiable, untestable proposition. It is merely assumed, and the data are uncritically interpreted in that light.
The bottom line here is that the failure of the arrows to act as “cues” contradicts the investigators’ predictions, and they don’t know why. This raises the question of why they planned an experiment containing what, at the outset, they must have considered a serious confound. The failure of the arrows to cue the percept constitutes a serious challenge to their underlying assumptions, and needs to be addressed.
The authors’ further rationalization, that “This is perhaps because the arrows are regarded as extrinsic to the stimulus and hence unfavored for recruitment,” begs the question: regarded as extrinsic by whom? The conscious observer? This leads, again, to the possibility of response bias.
But Wallis and Backus have their own response bias to the suggestion of response bias in their subjects: “We regard cue-recruitment as a cognitively impenetrable, bottom-up process….”
Thinking this is one thing, corroborating it another. The use of perceptually unstable stimuli producing temporally limited effects reliant on memory and forced choice responses isn’t a method designed to guard against potential response bias, but rather one that offers fertile ground for it. The convenience of dichotomous responses for data analysis can’t offset these disadvantages.
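The vulnerability is easy to demonstrate with a toy simulation (Python; all numbers are invented for illustration and are not the authors' data): with zero true perceptual effect, a modest report bias in the trained direction is enough to shift the proportion of dichotomous reports.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_condition(n_trials, p_percept=0.5, report_bias=0.0):
    """Proportion of 'trained-direction' reports when the true percept is
    at chance (p_percept) but reports are nudged on a fraction of trials."""
    percept = rng.random(n_trials) < p_percept   # true (unbiased) percept
    flip = rng.random(n_trials) < report_bias    # trials where bias overrides the percept
    report = np.where(flip, True, percept)       # biased dichotomous report
    return report.mean()

# Zero true perceptual effect in both conditions; bias only in the 'trained' one.
print(simulate_condition(400, report_bias=0.0))   # ~0.50, control condition
print(simulate_condition(400, report_bias=0.15))  # ~0.57, looks like 'recruitment'
```

A shift of this size could be difficult to distinguish, in the dichotomous data alone, from genuine cue recruitment.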
Short version: The possibility of response bias has in no way been excluded.
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.