Visual processing does not occur in passive systems. We and our evolutionary ancestors are and were acting creatures, and our visual systems evolved to enable effective locomotion and action. Indeed, sensation of the external world is only evolutionarily meaningful to the extent that the sensing organism can respond to the input (accounting for the dearth of eyes in the plant kingdom), and if vision’s purpose is action, our understanding of the visual brain is likely to be well served by studying vision in an action context. Even for our more sedentary activities, like reading, watching TV, or chatting with a friend, our eyes are in constant motion, actively gathering information. We are virtually incapable of passive vision. Our goals in this chapter are to highlight some of the ways in which seeing is coupled to action and to show that our understanding of visual processing can be enhanced by considering its relationship to action. To make our case, we will demonstrate that action has access to visual information that perception does not, and we will demonstrate that actions and action plans influence what we consciously perceive.
We will begin the chapter with an overview of cortical substrates for visuomotor processing and prevailing theories about the division of visual-processing labour in the cortex. We will then discuss findings from neuropsychological and behavioural studies and what they have taught us about the complex and sometimes counter-intuitive relationship between vision and action. Our approach might be considered ecological to the extent that it stresses vision’s tight coupling with action processing. Indeed, our analysis focuses mainly on functions that are considered to be under the purview of the visuomotor ‘dorsal stream’ in the posterior parietal cortex and, as we discuss in the next section, this visual stream is thought to be involved in the direct transformation of vision into action. It has previously been suggested that the visual functions of the dorsal stream might be those for which an ecological approach to vision is appropriate, in contrast to the visual functions of the perceptual ‘ventral stream,’ for which a constructivist approach to vision may be more fitting . We are sympathetic to this view and to the idea that a greater respect for vision’s behavioural role will assist future vision science. And though we will not be endorsing Gibson’s direct perception approach to vision  or declaring allegiance to a strictly ecological approach to vision, the spirit of our argument will echo J.J. Gibson’s contention that vision relies on the moving, acting person in their environment.
2. Cortical substrates for visuomotor processing
In the human primary visual pathway, visual input from the retina proceeds, via the lateral geniculate nucleus, to primary visual cortex (V1), located in the occipital lobe. Visual information then proceeds to extrastriate visual areas in the occipital lobe, posterior parietal lobe, and temporal lobe. Our discussion will focus mainly on functional areas within the posterior parietal cortex (PPC), which are thought to be involved in the preparation and control of visually guided actions. Some of the areas of the PPC that will be particularly relevant to the discussion are the parietal-occipital junction (POJ), the superior parietal lobule (SPL), the intraparietal sulcus (IPS), and the inferior parietal lobule (IPL). Studies in patient populations have shed light on the functions of these areas, and we address some of this work later in the section. First, however, we discuss three theories of cortical organization for visual processing that have significantly advanced our understanding of extrastriate processing: Mishkin, Ungerleider, and Macko’s  two visual streams hypothesis, Goodale and Milner’s [4,5] Perception-Action Model (PAM), and Glover’s  Planning-Control Model (PCM). These theories all posit branching paths of visual output from V1, and their differences lie in the functions they assign to each visual processing stream. The PAM and the PCM are particularly relevant to our thesis, for they highlight regions of parietal cortex devoted to the translation of vision into action.
Mishkin, Ungerleider, and Macko  proposed that the primate visual system is divided into two cortical streams: a dorsal stream projecting from primary visual cortex to posterior parietal cortex (PPC) and a ventral stream projecting from primary visual cortex to inferior temporal cortex (IT). They suggested that the dorsal stream is responsible for processing visual information relating to object location (‘where’), while the ventral stream is responsible for processing visual information relating to object identity (‘what’). This functional division was supported by evidence from monkeys with lesioned PPC, who were impaired in their ability to select a target based on its relationship to a landmark object, and from monkeys with lesioned IT, who were impaired in their ability to select a target based on its shape and surface patterning.
Goodale and Milner [4,5] suggested an important modification to the two visual streams hypothesis. They argued that the functions of the two streams should be considered in terms of the purpose for which the visual information is being processed. According to the PAM, the dorsal stream processes visual information for action (‘how’) and the ventral stream processes visual information for perception (‘what’ and ‘perceptual where’). One of the key pieces of evidence for Milner and Goodale’s model was the perceptual and motor performance of D.F., a patient with visual form agnosia. D.F.’s ability to identify objects and their shapes is dramatically impaired. However, her ability to reach to and grasp objects is largely preserved. For example, D.F. can accurately rotate a card to fit through a slot during a posting action, but fails when asked to report the orientation of the slot perceptually . D.F. suffered damage to her lateral occipital cortex (LOC)  in the ventral stream but has a largely intact PPC, and Milner and Goodale  inferred that the PPC was responsible for her preserved visuomotor function. Milner and Goodale contrast D.F.’s preserved motor abilities and pattern of cortical damage with those of patients with optic ataxia (a condition we describe in more detail later), who experience impaired goal-directed action and tend to have lesions to areas within the PPC .
Additional support for the PAM was provided by studies in non-patient participants interacting with visual illusions, which were shown to fool perceptual reports but not goal-directed actions [e.g. 9,10]. For instance, when participants responded to the central circle in the Ebbinghaus illusion, a size-contrast illusion in which the central circle appears larger when surrounded by smaller circles, perceptual reports were more susceptible to the illusion than actions were . These findings were consistent with the idea that the ventral stream considers relationships among objects, driving the size-contrast illusion effects, while the dorsal stream considers only the action-relevant parameters of the target object. However, conflicting findings across studies and issues relating to task differences among perceptual and reach tasks ultimately weakened – though may not have defeated – the case from the illusion literature, and we refer the reader elsewhere for an extensive discussion of the illusion controversy [11,12].
Milner and Goodale’s  reformulation of the two visual streams hypothesis was possible thanks to a consideration of the relationship between action and vision. They shifted the focus from the kinds of visual information (spatial vs. identity) to the behavioural role of the visual information (acting vs. perceiving). Thus, Milner and Goodale’s  model provides an example of how thinking about action advanced our understanding of vision.
Glover’s  PCM, like the PAM, emerged from a focus on the role of action in visual processing. Indeed, it has much in common with the PAM: both models assign perceptual/cognitive and visuomotor processes to different cortical streams. Where the PCM differs from the PAM is in its proposal that different visual information is used for movement planning than for movement control and in its contention that distinct cortical areas serve as substrates for the two phases of action. To clarify the differences between the PCM and the PAM, we return, briefly, to the PAM and its view of movement preparation and control.
According to the PAM, the role of the ventral stream is to permit a high-level understanding of one’s visual environment and the relationships among the objects within it. Information derived from ventral stream processing allows one to select, based on current goals, an object for action. Once this is done, control is passed to the dorsal stream, which carries out the specification of the movement parameters and monitors online performance. This action preparation and online control is, according to Milner and Goodale , carried out within the superior parietal lobule (SPL) and the intraparietal sulcus (IPS).
The PCM’s differences from the PAM relate primarily to which factors influence movement preparation and which cortical areas are responsible for different kinds of visuomotor processing. The PCM proposes a third stream, in the inferior parietal lobule (IPL), that is responsible for movement preparation. This stream, according to Glover , considers non-spatial target factors (e.g. object weight) and contextual elements (e.g. background motion), which inform -- and, in the case of visual illusions, fool -- the initial preparation of the movement. Whereas the PAM places both movement preparation and online control within the same ‘how’ stream in the SPL and IPS, Glover’s PCM separates the initial phase of movement production (in the IPL) from real-time control (in the SPL), and argues that separate representations underlie each phase of the movement. One of the pieces of behavioural evidence for the PCM was a careful examination of the unfolding movement during actions towards visual illusions. Glover and Dixon  showed that the effect of an illusion on grip aperture was stronger at the start of the movement and diminished as the movement progressed, potentially indicating that movement planning and movement control were drawing upon different representations of the target and its environment.
Glover  is not alone in arguing for a third visual stream. Rizzolatti and Matelli  have argued for a division of labour within the dorsal stream: a dorsal-dorsal stream in the SPL that is responsible for online control and a ventral-dorsal stream in the IPL that is involved in both action and perception. Pisella et al.  have also argued for more than two visual streams. We will encounter some of the evidence for a divided dorsal stream in the section on optic ataxia.
2.1. Evidence from patient populations
We turn next to visual deficits in patients who have suffered damage to different areas of visual cortex. We have chosen to focus on three conditions: blindsight, because it provides a dramatic example of vision for action without conscious awareness; optic ataxia, because it involves damage to the PPC and is informative with regard to action processing in that region; and hemispatial neglect, because it provides an important contrast to optic ataxia and also provides insight into the role of attention in action.
2.2. Blindsight
The term ‘blindsight’ is typically used to describe the phenomenon in patients with lesioned V1 who report being unaware of objects in their blind visual field yet remain able to access some visual information about objects presented within it. For instance, patients can locate stimuli in their blind field for which they do not report any conscious awareness . There has been considerable debate regarding the implications of blindsight for conscious visual processing and whether incomplete lesioning of V1 can explain performance in blindsight patients (see [17,18] for reviews). We will not outline the debate here; rather, we wish to highlight the important point that several of the behaviours observed in blindsight -- looking at a target , pointing to a target , or even reaching to post a letter through a slot  -- involve acting without consciously seeing. These behaviours provide examples of situations where action is able to access visual information that perception cannot. The phenomenon is analogous to the one described in the visual agnosia patient D.F., who can interact with objects of which she is perceptually unaware . Indeed, Milner and Goodale  have previously drawn attention to the implications of blindsight for the PAM, and have suggested that projections from subcortical structures to areas in the PPC may permit the preserved action in blindsight. We refer the reader to that text for a more detailed discussion of the evidence.
A recent blindsight study by Whitwell et al.  provides indirect support for Milner and Goodale’s  PAM. Whitwell et al.  examined real-time and delayed grasping performance in a blindsight patient (S.J.) when she reached to targets presented in her blind field. They found that S.J. scaled her grip aperture to target size when movements were initiated while the target was ‘visible’ (i.e., not occluded by the experimenter), but failed to scale her grip aperture when the target was occluded 2 seconds prior to the imperative stimulus. Furthermore, S.J. was incapable of perceptually reporting the size of the target presented in her blind field. These findings are consistent with Milner and Goodale’s  suggestion that the dorsal stream has no memory and only processes currently available visual information; however, this inference does rely on the assumption that dorsal stream processing is preserved in S.J.
Independent of any dorsal/ventral considerations, blindsight studies show that goal-directed actions can uncover visual function that perceptual reports may not. Action’s ability to tap into non-conscious vision has also been observed in non-patients: In a later section we describe a behavioural study  that reveals movement scaling to perceptually-inaccessible target size, an effect that mimics the non-conscious reach performance in blindsight. That study suggests that the putative dorsal-stream processing in blindsight patients naturally occurs in non-patient participants, whose visuomotor systems are able to see what perception cannot.
2.3. Optic ataxia
Optic ataxia is a motor disorder characterized by deficits in goal-directed reaching, and neuropsychological investigations of the brain lesions associated with optic ataxia have contributed considerably to our understanding of visuomotor processing. However, the nature of the disorder is complex, and some of its implications for our understanding of the visuomotor dorsal stream are not yet clear.
One of the main pieces of evidence presented by Milner and Goodale  for the PAM is the proposed double-dissociation between visual agnosia and optic ataxia. The preserved motor function and impaired object perception in D.F., who has lesioned ventral stream and intact PPC, contrasts with the impaired motor function and preserved object perception in patients with optic ataxia, who have intact ventral streams and lesioned PPC. Other researchers, however, have questioned the validity of this double-dissociation [15,6].
Part of the difficulty with interpreting performance in optic ataxia stems from the observation that motor performance to targets presented in central vision is often comparable to that of controls; major performance deficits appear only when targets are presented in the visual periphery . In other words, when patients are able to fixate the target, they can accurately reach to it. However, there is evidence that subtle movement deficits can be detected when the target is in central vision. A study by Pisella et al. , for instance, showed that an optic ataxia patient, I.G., who has bilateral lesions to PPC was much slower and less fluid than controls in correcting her movements online when a target in central vision was displaced during a reach (though this effect, too, may be explained by the central/peripheral distinction, for the displacement moved the target away from fixation). Such findings have been taken to indicate that the dorsal stream may be more important for on-line movement control than it is for the initial parameterization of movements [25,6].
Milner and Goodale  have outlined some evidence that counters this view of an ‘on-line only’ dorsal stream, noting preparation deficits in optic ataxia as well as the preserved movement preparation, not just preserved on-line control, in the visual agnosia patient D.F. However, Milner and Goodale  do not directly address the central vs. peripheral vision discrepancy observed in optic ataxia [15,25], leaving the optic ataxia/visual agnosia double-dissociation question unresolved. At the very least, however, research on optic ataxia suggests that regions within the SPL are involved in transforming visual input to motor output. Whether the SPL’s function is restricted to on-line visuomotor processing has yet to be determined.
More recently, Pisella et al.  have suggested that the evidence from optic ataxia indicates that one of the key functions of the dorsal stream is the spatial coding of targets in an eye-centered coordinate frame. Pisella et al.  assign this spatial coding function specifically to the parietal-occipital junction (POJ), a common lesion site in patients with optic ataxia. This account helps explain the peripheral target deficit observed in optic ataxia. Pisella et al.  further argue that dorsal stream function is important for both action and perception. This claim is supported by a recent study , which showed that optic ataxia patient I.G. was not only impaired in her on-line responses to a target displacement, but that she was also impaired in her perceptual report of the same target displacement.
Pisella et al.  also raise the important point that the dorsal stream probably has some role in perception for another reason: Areas within the dorsal stream are thought to be involved in attention orienting, which is fundamental to perceptual processing. Later in the chapter we address the important links between attention, perception, and action. In anticipation of that discussion, we provide first an overview of hemispatial neglect, a disorder of attention and spatial representation that, like optic ataxia, typically results from damage to regions within the parietal cortex.
2.4. Hemispatial neglect
Patients with hemispatial neglect suffer from a tendency to ignore half of their visual field, failing to acknowledge or interact with objects in the neglected field unless strongly encouraged to do so. Their performance deficits are generally considered distinct from those of optic ataxia patients: the reaching deficits in optic ataxia are thought to be related to damage to the SPL or POJ, whereas the performance deficits in neglect are thought to be related to damage to the IPL .
Hemispatial neglect theoretically provides an interesting comparison to optic ataxia, for it should allow researchers to examine the relationship between attention and visuomotor control without the complicating visuomotor deficits present in optic ataxia. It also allows researchers to examine the impact, or lack thereof, of impaired visual awareness on motor function. However, one of the challenges researchers face when interpreting performance in neglect patients lies in ruling out the possibility that any visuomotor deficits observed in these patients arise from cortical damage that extends into visuomotor areas. For instance, Himmelbach and Karnath  have suggested that superior temporal cortex, rather than IPL, is directly responsible for the deficits of perceptual space representation found in neglect, and that the motor deficits found in some patients with neglect might stem from damage that extends to the IPL, a region they argue is involved in spatial coding for motor function, but which is not involved in the cognitive spatial coding that characterizes neglect. More recently, Himmelbach et al.  have argued that the neglect-specific effects of space representation are specifically linked to lesion sites at the superior temporal gyrus and temporo-parietal junction. They have also suggested that real-time motor control functions, such as those observed in optic ataxia, are supported by the POJ, an argument that aligns with Pisella et al.’s .
As mentioned earlier, the motor deficits of optic ataxia are particularly prominent when participants reach for targets presented in the visual periphery. This contrasts with the visuomotor performance of patients with neglect, who generally exhibit accurate motor performance to objects presented in their neglected field . Although motor deficits have been observed in neglect patients, these tend to be relatively minor compared to the deficits of optic ataxia patients. One of the motor performance deficits that has been found in neglect patients is a delay in the initiation of reaching movements into their neglected visual field. Some studies also indicate minor impairments in online performance, whereas others show an absence of any deficits in online control (see  for a review). Himmelbach et al. [30,31] argue that when a proper control group is used (i.e., patients with parietal damage who do not exhibit neglect), hemispatial neglect is not associated with any impairments in movement control. However, a recent study by Rossit et al.  suggests that neglect patients may be slower to correct their movements online compared to both healthy controls and right hemisphere patients without neglect.
In the Rossit et al.  study, the authors applied a target jump design modeled after Pisella et al.’s  design, in which a participant is tasked with either going to a target when it is displaced at movement onset (location-go) or trying to stop their movement as soon as the target is displaced (location-stop). The location-stop condition allows the researcher to probe the automaticity of the online corrections; any deviations toward the target that occur in this condition can be attributed to automatic online control. In the location-go condition, neglect patients were slower than the control groups, by 80-100ms, to correct their movements online when the target jumped into their neglected field. Endpoint accuracy, however, was equivalent across the groups. In the location-stop condition, neglect patients exhibited an equivalent number of online corrections to the control groups. The neglect patients, in fact, had greater difficulty stopping their movements than the participants in the control groups. These results suggest that the ‘automatic pilot’ is intact in neglect patients. However, the results also suggest that visuomotor processing in the neglected field is slowed, perhaps as a result of impaired attention-for-action in that field.
Some authors have suggested that optic ataxia and hemispatial neglect represent a double dissociation (e.g. [31,32]). In support of this view, a recent case study showed that real-time grasping was preserved in a neglect patient’s neglected field, whereas delayed grasping was dramatically impaired . These outcomes are the inverse of those in optic ataxia patient I.G., who exhibits impaired real-time grasping but actually improves when asked to execute a delayed pantomime grasp .
The relative absence of major motor deficits in hemispatial neglect provides a further piece of evidence for action having access to visual information that perception does not. As noted at the outset of this section, patients with hemispatial neglect have a failure in the perceptual representation of part of their visual world. Their visuomotor system’s preserved ability to reach to and grasp objects within this neglected field is suggestive of a perception/action dissociation, though it may not fall precisely along ventral/dorsal lines. A recent review by Harvey and Rossit  provides a comprehensive overview of visuomotor function in hemispatial neglect, and we direct the interested reader there for a fuller account of the syndrome’s complexities and its implications for the functional organization of parietal and temporal cortex.
2.5. Section summary
We reviewed three theories of cortical organization for visual processing and discussed neuropsychological findings from blindsight, optic ataxia, and hemispatial neglect. The picture that emerges is one of a modularized PPC, with current evidence favouring the POJ and the SPL as key sites for real-time visuomotor computations. Critically, the visuomotor processing carried out by these areas appears to proceed automatically, without mediation by conscious visual processing. These sites are implicated in direct visual-to-motor transformations, and they are areas whose visual functions can only be probed by engaging participants in goal-directed movement tasks.
The visuomotor role of the IPL is somewhat less clear. It is a common lesion site in hemispatial neglect, which may implicate it in the orienting of attention. Glover  has argued that the IPL is important for movement planning, and neglect patients with damage to that area do tend to exhibit more motor deficits than neglect patients with undamaged IPL , which provides some support for Glover’s assertion. At the same time, the breakdown in the cognitive spatial representation that is associated with damage to superior areas of the temporal cortex , a spatial deficit that does not appear to undermine action control, is consistent with Milner and Goodale’s  argument for different spatial representations for perception and action.
The findings from patients with cortical lesions to visual areas support the idea that vision-for-action can proceed independently of vision-for-perception, though the possibility remains that the effects observed in patients do not represent the normal function of the preserved cortical areas. In the following section we examine converging evidence from non-patient participants for the idea that vision-for-action can access information that perception does not.
3. Action can proceed without perception: Evidence from cortically-intact participants
In this section we provide further evidence for action processing without visual awareness. We focus on studies in non-patients, which show that even when all areas of visual cortex are intact, visual information that drives action can elude conscious detection. This suggests that action’s access to unperceived visual information is part of normal visual processing. We examine evidence from three different paradigms: backward masking, saccadic suppression of target displacement, and motor adaptation. In each case, motor responses to events are not only possible, but do not appear to suffer as a result of suppressed visual awareness.
3.1. Evidence from backward masking
One of the ways that non-conscious visual processing has been investigated is by masking a response-relevant stimulus and observing its impact on behaviour. When a stimulus is successfully masked, the participant does not report awareness of it. In metacontrast masking, for instance, the stimulus to be masked (the prime) is presented and then, shortly after, a larger stimulus (the mask) is presented around the prime. This sequence of stimuli can eliminate participants’ awareness of the prime while influencing motor responses .
Taylor and McCloskey , for example, used metacontrast masking and showed that the reaction times for a motor task were influenced by the unseen prime. When a light was briefly flashed and then, 50ms later, 4 lights that closely surrounded the location of the first light were flashed, thereby producing a metacontrast mask, participants’ reaction times were linked to the presentation of the initial stimulus, in spite of their having failed to consciously report its presence. Furthermore, Cressman et al.  have shown that movements that have already been initiated can be influenced by an unseen directional prime, such that participants adjust their movement online. In that study, a directional prime (left arrow, right arrow, or neutral stimulus) was presented at movement onset and then quickly masked with a larger arrow. Participants’ movement endpoints were dictated by the mask, but the unseen directional primes triggered substantial trajectory deviations ahead of the explicit response to the mask.
These results suggest that the motor system can respond to visual information that is inaccessible to conscious awareness. However, this does not necessarily imply that the prime information is being processed by the dorsal stream. In fact, when the prime is a symbol that must be translated into a directional response (e.g. [36,38]), it is likely that ventral stream processing is involved. The perceptual representation of the shape may fail to reach awareness, but it is a representation that has the potential to be perceived, which, as Milner and Goodale  argue, should still be classified as ventrally-mediated.
However, when the masked stimulus is, itself, the target of the action, direct involvement of the dorsal stream is more likely. In a study by Binsted et al. , participants were tasked with making aiming movements directly to a masked target, the size of which was manipulated across trials. The study showed that movement times were scaled to the size of the target (shorter times for larger targets, longer ones for smaller targets), in accordance with Fitts’ Law. Thus, even though participants did not consciously perceive any changes in the size of the target, their motor responses were appropriately tuned to it. This study showed that healthy participants could experience a blindsight-like ability to scale their visuomotor response to something they could not consciously see. Because the visual information that action is drawing upon in this instance is presumably the same information that it would be using in the absence of the mask, we can infer that visual processing for immediate action control is not normally mediated by conscious vision. Action may have access to sub-threshold conscious vision or it may draw upon different visual information altogether, as suggested by the PAM. Thus, either as a matter of degree of visual input to which they are sensitive or kind of visual input upon which they rely, vision-for-action and vision-for-perception clearly differ.
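The relationship between target size and movement time described by Fitts’ Law can be made concrete with a short numerical sketch. The code below uses the classic formulation, MT = a + b * log2(2D / W), where D is the distance to the target and W is its width. The intercept a, the slope b, the reach distance, and the target widths are illustrative placeholders, not values taken from the masking study discussed above; the sketch only shows the direction of the effect (larger targets, shorter predicted movement times).

```python
import math


def fitts_index_of_difficulty(distance, width):
    """Index of difficulty in bits, classic Fitts formulation: log2(2D / W)."""
    return math.log2(2.0 * distance / width)


def fitts_movement_time(distance, width, a=0.10, b=0.15):
    """Predicted movement time in seconds: MT = a + b * ID.

    a (intercept) and b (slope) are hypothetical values chosen for
    illustration; in practice they are fitted to each data set.
    """
    return a + b * fitts_index_of_difficulty(distance, width)


if __name__ == "__main__":
    reach_distance_cm = 30.0  # assumed reach amplitude
    for target_width_cm in (1.0, 2.0, 4.0):  # assumed target sizes
        mt = fitts_movement_time(reach_distance_cm, target_width_cm)
        print(f"width {target_width_cm:.1f} cm -> predicted MT {mt:.3f} s")
```

Under this formulation, each doubling of target width lowers the index of difficulty by one bit, so predicted movement time falls by the slope b. This is the pattern of scaling that the masked-target results would be compared against.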
3.2. Evidence from reaches to saccadically-suppressed target displacements
We consider next a very robust dissociation between perception and control that occurs when people make simultaneous eye and hand movements. We will take up the perceptual effects of saccadic eye movements in more detail in a later section. For the current section, one need only know that when a target is displaced during a saccadic eye movement, the displacement is largely invisible. Surprisingly, people fail to notice a change in the target’s location even when the displacement is as large as one third of the saccade magnitude .
Bridgeman et al.  showed that when participants pointed to a target that had been displaced during a saccade, they could accurately acquire it, even though they were unaware of the change in location. Goodale et al.  and Pelisson et al.  demonstrated that online responses of the motor system were also sensitive to saccadically-displaced targets. They showed that even when participants had initiated a reach towards a target’s pre-saccadic location, the reach smoothly updated itself to acquire the displaced post-saccadic target location. This adjustment to the reach occurred in spite of participants having no vision of their hand and no awareness of the target displacement. This effect was also shown for targets that were displaced tangentially to the primary axis of the movement . In sum, awareness of a target displacement is not needed for motor adjustments to the displacement.
3.3. Evidence from motor learning in response to unperceived visual changes
When people encounter an altered visual environment, they adapt their movements over the course of exposure to it, such that initially inaccurate movements gradually improve. For example, if one were to view the world through displacing prisms that shifted the visual world to the right, one’s movements would initially err to the right of a reach target. Visual feedback would allow correction of this error over the course of multiple movements. Subsequent removal of the prisms would then produce motor errors in a leftward direction (‘aftereffects’) as a result of the newly acquired mapping between vision and motor output.
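The time course described above can be illustrated with a simple error-driven learning rule, in which each reach error nudges an internal correction in the opposite direction. This is a generic single-rate model offered as a sketch, not a model proposed in this chapter; the function name, the 10-degree shift, the learning rate, and the trial counts are all assumptions.

```python
def simulate_prism_adaptation(shift_deg=10.0, learning_rate=0.2,
                              n_exposure=30, n_post=5):
    """Return (exposure_errors, post_errors) in degrees; positive = rightward.

    Hypothetical model: the reach error on each trial equals the current
    visual shift plus a learned correction, and the correction is updated
    by a fixed fraction of that error.
    """
    correction = 0.0  # learned motor offset; negative = leftward
    exposure_errors, post_errors = [], []

    # Prisms on: the visual world is shifted rightward by shift_deg.
    for _ in range(n_exposure):
        error = shift_deg + correction
        exposure_errors.append(error)
        correction -= learning_rate * error

    # Prisms off: the shift is gone but the learned correction remains,
    # producing errors in the opposite direction (the aftereffect).
    for _ in range(n_post):
        error = 0.0 + correction
        post_errors.append(error)
        correction -= learning_rate * error

    return exposure_errors, post_errors


exposure, post = simulate_prism_adaptation()
print("first exposure error:", round(exposure[0], 2))
print("last exposure error: ", round(exposure[-1], 2))
print("first post-exposure error:", round(post[0], 2))
```

Under these assumptions the initial error is rightward, it shrinks across exposure trials, and the first reach after prism removal errs to the left before gradually washing out.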
The interesting effect for the purposes of the current discussion is that people can acquire new visuomotor mappings without any awareness that their visual environment has changed. In fact, learning appears to be more robust if people do not know that the environment has been altered. Michel et al., for instance, showed that gradually incrementing the amount of prism shift, such that participants were unaware of it, led to stronger aftereffects than the introduction of a large, consciously detectable prism shift.
People can also adapt to systematic, imperceptible changes in a target’s location between the start and end of their movements. This adaptation can occur when the movement error is presented at the end of a reaching movement, but it can also occur when the target is displaced during the reaching movement, allowing for online corrections that eliminate any visual error at the end of the reach [46, cf. 47]. Furthermore, if participants are made aware of the target displacement, the amount of adaptation is considerably diminished.
The adaptation effect for reaching movements to displaced targets is similar to the adaptation that occurs for eye movements. Saccadic adaptation is a well-documented phenomenon in which the size of saccades gradually increases (or decreases) when people are repeatedly exposed to forward (or backward) displacements of the target . This effect, like the one for reaching movements, is thought to draw upon the natural calibration of our movements that occurs throughout our everyday lives, a process that typically occurs without any awareness of the error in our movements.
3.4. Section summary
Conscious perception of changes in the visual environment is required for neither real-time control nor motor learning. In fact, motor learning may even be enhanced if one is unaware that a change has occurred. These findings do not necessarily imply that vision for perception and vision for action rely on separate cortical streams, but they do show that what action sees is not necessarily what perception sees. This is an important point, for it suggests that the principles governing vision for perception may differ from those governing vision for action. By measuring motor responses, not just perceptual reports, we can tap into a wealth of visual processing that we might otherwise miss.
4. Action influences visual attention and perception
So far, in discussing topics such as the PAM, blindsight, and masking studies in healthy participants, we have devoted much attention to the phenomenon of acting without consciously seeing. In this section we turn our attention to perception, and examine some of the ways that the intention to act changes what we see.
4.1. Saccades in action
Perhaps the most obvious example of the link between action and perception is eye movements. To pick up detailed information about the world around us, we constantly re-orient our gaze via movements of our eyes and head. Saccades, which are fast and largely ballistic, are the most common type of eye movement, and much of our internal representation of the visual world is constructed from the detailed snapshots they provide. While it is probably not surprising to many of us that saccades are constantly being used to shift our gaze and thus inform perception, saccades also influence perception in other more subtle ways.
One perceptually subtle (but experimentally dramatic) effect of saccades is their ability to mask large changes in the visual scene. As previously mentioned, saccade targets can be displaced by distances as large as a third of the saccade magnitude without the participant reporting any change . Entire objects can be rotated or even deleted from a scene during a saccade, and participants will fail to notice the change . In short, saccadic eye movements introduce periods of change blindness. This effect is thought to be partly due to our visual system’s built-in assumption that the world is stable and that trans-saccadic changes in object locations are more likely to result from eye movement errors than they are to result from actual changes in the scene . Ironically, then, the perceptual effect of saccadic suppression is a no-percept effect; suppression serves to keep the visual world stable and our conscious perception of it unperturbed. This demonstrates not only that oculomotor plans influence perception, but also that our action-driven visual system is carefully tuned to compensate for perturbations that are caused by internally generated movement.
4.2. Action goals dictate where our eyes go
When we reach to, pick up, and use objects to accomplish goals, our eyes precede our manual actions, orienting to the relevant parts of relevant objects. Land et al.  tracked people’s eye movements as they carried out the actions of brewing a cup of tea in a kitchen, and the researchers observed that people’s eye movements were tied to the behavioural goals; the eyes did not jump from one visually salient object to another but, rather, moved deliberately from one task-relevant object to the next. Detailed analysis of eye-hand coordination during object grasping and manipulation tells the same story: the eyes are drawn to contact points between the hand and the object and between the manipulated object and other objects [53,54].
Furthermore, the coupling between the eyes and the hand appears to be quite strong, and will resist conscious attempts to break it. For instance, when people are told to look and point to a target and then move their eyes to a new saccade target that appears while the hand is in flight, they fail to complete the saccade task. The eyes remain locked on the target of the reaching movement until the hand has landed . Thus, the eyes strategically move to pick up relevant information for goal-directed action, and they are tightly bound to this task. The coupling between eye and hand is a perfect example of action’s role in dictating where and when we acquire visual information from our environment.
As much as the eyes may want to lead the hand, it is possible to override the coupling by fixating the eyes in one location prior to initiating a reach to a peripheral target. The task requires some effort on the part of the performer, but it can be done (and is, in fact, well employed in laboratory settings when tight control over visual input is desired). You may have noticed, for instance, that you can reach for a cup of coffee while keeping your eyes on the book or screen before you, though at some cost to movement accuracy. As the next sections will show, however, even when the eyes remain locked in place during a manual task, visual attention does not; it is bound to action goals.
4.3. On the relationship between action and attention
Attention is vital to our experience of the visual world. Most of us have probably experienced the frustrating search through a crowded restaurant in which we only see our dinner companions after having already walked past them once or twice, or the search that happens at the open fridge door, where the item we want, and cannot find, has been in front of us all along. Controlled experiments have shown that people will reliably fail to see large objects that disappear and reappear in blinking scenes  or even fail to see a person in a gorilla suit walk through the middle of a scene . Attention is the construct used to explain these effects. The idea is that there is far more information in the visual field than our brain can or wants to cope with at any one time. The brain, therefore, relies on attention to select a portion of visual information for analysis. And, as a result, if we do not attend to something, we are blind to it.
That attention is important for conscious perception is clear. When we consider, however, that the purpose of human information processing is not just perception but also action, it is also clear that attention systems should not be examined independently of action systems. One of the first to raise this point was Allport, who noted that the important constraint upon visual analysis of a scene may not be central processing limitations, but the need for action coherence. Allport’s argument was that motor systems need to be tied to one object at a time; if visual information about multiple objects is permitted access to these systems simultaneously, the action will fail. The hand, for example, cannot successfully grasp a cup if the information guiding the reach is also coming from the apple, the bottle, and the pencil sitting next to the cup.
The importance of action to the allocation of attention has also been stressed by Rizzolatti et al. , who proposed a premotor theory of attention, in which eye-movement motor programs drive the spatial allocation of visual attention. Tipper et al.  have likewise emphasized the role of action in attention, proposing that attention operates within an action-centered representation of visual space. Schneider , meanwhile, has proposed the Visual Attention Model (VAM), a framework in which a central attention mechanism binds perceptual and action systems to the same object. Each of these perspectives on the relationship between attention and action will be examined next.
4.4. Premotor theory
The premotor theory of attention has probably been the most influential of the action-based theories of attention. As initially proposed , premotor theory attributed the control of attention to oculomotor programming; even when the eyes remained still while attention was shifted (covert orienting), the attention shift was purportedly due to the programming of an eye movement that was subsequently inhibited. Premotor theory was later modified to allow for goal-directed motor programming of any kind (e.g. reaching) to produce attention shifts , but the basic premise remained the same. The mechanism underlying this process was, according to Rizzolatti et al. , the activation of neurons in spatial pragmatic maps.
These pragmatic maps are proposed to reside in brain areas associated with action (e.g., parietal reach areas; parietal, frontal, or sub-cortical eye movement areas), and they code space only insofar as it is relevant to the action that they are involved in programming. Thus, according to premotor theory, there is no higher-level attention system. Rather, attention shifts simply result from the selective activation of pragmatic map neurons, and this activation only occurs when a movement is programmed to that region of space.
Some of the strongest support for premotor theory can be found in neurophysiological studies. Moore and Fallah , for instance, showed a causal link between activation of eye movement cortex and the allocation of attention. Moore and Fallah stimulated monkeys’ frontal eye field (FEF), a cortical area involved in the control of voluntary eye movements. They began by stimulating a part of the FEF with enough current to trigger an eye movement. They then reduced the stimulation to a sub-threshold level (i.e., the stimulation was too low to trigger an eye movement). They found that this sub-threshold stimulation improved the monkey’s ability to detect a change in the target stimulus when the stimulus fell within the region of the visual field corresponding to the destination of the eye movement that had previously been triggered by supra-threshold stimulation of the FEF. In a similar study investigating the attentional role of the superior colliculus (SC) (a subcortical area directly involved in the control of eye movements), Muller, Philiastides, and Newsome  found that sub-threshold stimulation of the SC also produced enhanced detection of the target stimulus. Both of these studies demonstrate covert orienting of attention resulting from activation of oculomotor areas of the brain, consistent with premotor theory.
4.5. Action-centered attention
An important step in understanding attention is determining the nature of the spatial representation upon which it operates. Premotor theory suggests that spatial pragmatic maps underlie the allocation of spatial attention (though it also states that attention emerges from these maps, rather than operating upon them). A related view, advanced by Tipper et al. , suggests that attention operates upon an action-centered representation. To get a better sense of what such a representation might be, we will first consider other kinds of spatial representation.
Tipper et al. outline four possible kinds of spatial representation upon which attention might operate: a 2-D retina-centered representation, a 3-D viewer-centered representation, an environment-centered representation, and an action-centered representation. A 2-D retina-centered representation is one in which the spatial relationships between objects are defined in terms of the objects’ relative positions in the 2-D retinal image. Thus, when it comes to attention, a distractor on the far side of a target (with respect to the viewer) would produce more interference than a distractor on the near side of the target, according to Tipper et al., because in the 2-D image the far object is closer to the target than the near object is. A 3-D viewer-centered representation, on the other hand, is one in which the distance of objects from the viewer is a relevant factor. If attention operates within this kind of representation, distractors on the near side of a target would potentially produce greater interference than distractors on the far side of the target. A viewer-centered representation differs from an environment-centered representation in that the orientation of the viewer with respect to the objects affects their salience. In the environment-centered representation, viewer orientation is irrelevant. Finally, the action-centered representation is one in which an object’s potential for interference depends upon its relationship to a planned action path. Thus, a distractor that resides within the action path will potentially produce greater interference than one that resides beyond the path.
Tipper et al. provided evidence that, during a reaching task, attention operates within an action-centered representation. They had participants reach and press target buttons that were arranged in a 3 × 3 array in the horizontal (transverse) plane. Below each button were a red light and a yellow light. Illumination of the red light indicated that the corresponding button was the target; the yellow light was irrelevant to the task, but it would sometimes be illuminated simultaneously at a different location, serving as a distractor. Tipper et al. examined the cost that the distractor imposed on the total time (TT) of the reaching movement, and found that TT suffered more (i.e., there was greater interference) when the distractor fell within the same row as the target or in a row between the hand start position and the target row than when it fell in a row beyond the target. Furthermore, when the hand start position was moved to the opposite (far) end of the board, the same pattern of results was found relative to the hand, ruling out a 3-D viewer-centered representation. Tipper et al., in discussing the mechanism underlying the action-centered interference, suggest that motor programs are activated, simultaneously, to both the target and the distractor.
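The logic of moving the hand start position can be sketched in a few lines. The code below contrasts what a viewer-centered account and an action-centered account would predict for a distractor in a given row of the 3 × 3 array. The row numbering (0 = nearest the viewer), the off-board start positions, the function names, and the treatment of predictions as simple true/false values are all simplifying assumptions made for illustration; they do not reproduce Tipper et al.'s analysis.

```python
NEAR_START = -1  # hypothetical hand start position at the viewer's end of the board
FAR_START = 3    # hypothetical hand start position at the far end of the board


def action_centered_high(start_row, target_row, distractor_row):
    """High interference if the distractor lies in the target row or in a
    row between the hand start position and the target row."""
    low, high = min(start_row, target_row), max(start_row, target_row)
    return low <= distractor_row <= high


def viewer_centered_high(target_row, distractor_row):
    """High interference if the distractor is no farther from the viewer
    than the target, regardless of where the hand starts."""
    return distractor_row <= target_row


target_row = 1  # middle row of the array
for start_row, label in ((NEAR_START, "near start"), (FAR_START, "far start")):
    for distractor_row in (0, 2):
        print(label,
              "| distractor row", distractor_row,
              "| action-centered:", action_centered_high(start_row, target_row, distractor_row),
              "| viewer-centered:", viewer_centered_high(target_row, distractor_row))
```

With the hand starting at the near end, the two accounts make the same predictions, so they cannot be told apart. Only when the hand starts at the far end do they diverge, which is why that manipulation is diagnostic.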
A later experiment by Meegan and Tipper  investigated whether the pattern of interference observed by Tipper et al.  was due to the distractor’s relationship to the response path, as Tipper et al.  had suggested, or to the distractor’s proximity to the start position of the hand. Meegan and Tipper  found that proximity to the hand was a better predictor of distractor interference. This finding does not necessarily undermine the action-centered model; Meegan and Tipper , for instance, suggest that objects nearer to the hand might produce greater response competition than objects farther from the hand, a framework consistent with the parallel response activation proposed by Tipper et al. . However, it is also possible that, because information about the location of the hand is important during action preparation , attention may initially be oriented to it, leading to greater interference from objects in its vicinity.
4.6. The Visual Attention Model and action-perception coupling
The Visual Attention Model (VAM), like premotor theory, posits that motor preparation and perceptual selection are coupled [61,67]. However, VAM differs from premotor theory in two major ways. For one, VAM suggests that the coupling between selection-for-action and selection-for-perception is bi-directional. In other words, selecting an object based on perceptual attributes (e.g. colour) also binds action systems to that object, and selecting an object for action (e.g. preparing to grasp an apple) also binds perceptual systems to that object. (Premotor theory only allows for action preparation to bind perceptual attention to an object.) The other way that VAM differs from premotor theory is that VAM posits an independent, higher-level, attention mechanism that binds action and perceptual processes. (Premotor theory argues against an independent attention system.) Much of the research that has been conducted within the VAM framework does not directly test VAM’s predictions against those of premotor theory. As a result, the research presented in this section – research that demonstrates the coupling between action and perceptual selection – can be taken as support for either VAM or premotor theory.
Deubel and Schneider  provide strong evidence of the coupling between oculomotor preparation and visual attention. In one experiment participants were instructed to make a saccade to a peripheral target based on a central cue (a number specifying the location of the target). After cue presentation, but prior to saccade initiation, a discrimination target (DT) (which was a normal ‘E’ or a reverse ‘E’) appeared either at the same location as the saccade target (ST) or at a different location. The DT was present very briefly, and was masked prior to the onset of the saccade. Participants’ discrimination performance was the dependent measure, and Deubel and Schneider  used this measure to infer the locus of attention. They found that participants’ performance was considerably enhanced when the DT was at the ST position. Performance dropped off considerably when the DT was at a different position than the ST, even if by only 1 or 2 degrees of visual angle. Because all perceptual discrimination occurred prior to any movement of the eyes, these results provide evidence of covert orienting resulting from oculomotor preparation. In another experiment, Deubel and Schneider  showed the same effect when the ST was specified exogenously. Furthermore, in order to control for the possibility that covert orienting might be occurring independently of saccade preparation rather than being driven by it, Deubel and Schneider  conducted an experiment in which participants were told beforehand the upcoming location of the DT. Participants could then try to attend to the DT location while programming a saccade to a different location. Again, however, discrimination performance was best at the ST location, suggesting strong coupling between oculomotor programming and perceptual selection.
Deubel, Schneider, and Paprotta extended these findings to show that reaching movements have the same impact as oculomotor programming on perceptual selection. The experiment’s design was similar to that of Deubel and Schneider, but with manual aiming movements instead of saccades. A central cue indicated which peripheral object was the aiming target (AT), and the participant’s task was to rapidly aim their finger to it while maintaining central fixation. The DT was always presented in the same location, so participants could attempt to attend to the DT while reaching to the AT. Despite participants’ foreknowledge of the DT location, discrimination performance was best when the AT coincided with the DT, suggesting obligatory coupling between reach programming and perceptual selection. A later study by Deubel and Schneider, however, showed that the coupling between reaching and perceptual selection persists only for a short period of time. If movement onset was delayed by more than 300 ms after the imperative stimulus, attention could be decoupled from the action target and oriented elsewhere. Eye movements, on the other hand, always bound attention to the saccade target, regardless of delay.
Baldauf, Wolf, and Deubel replicated Deubel et al.’s finding that manual preparation orients attention to the aiming target and extended it to show that preparing a multi-component movement can orient attention to multiple targets simultaneously. Participants executed rapid sequential aiming movements to two or three targets within a circular array of 12 stimuli. Identification of the transiently displayed DT was enhanced when its location coincided with the location of any one of the targets of the movement. Identification of the DT was poor at other locations, even a location falling directly between two target locations. This suggests that action preparation can drive multiple attention ‘spotlights’ in parallel.
A further example of the link between action intention and visual attention was provided by Bekkering and Neggers , who showed that visual target selection was influenced by whether the participant intended to grasp an object or point to it. When participants planned to grasp an object within a field of distractor objects, their initial eye movements, which were used as a marker of attention capture, were drawn less often to distractors of the wrong orientation than when participants intended to point to the target. That is, the intention to grasp may have allowed a pre-filtering of object orientation (a grasp-relevant, but not pointing-relevant, feature), thereby reducing the effect of the distractors on the initial eye movement.
4.7. Section summary
The studies discussed in this section have provided behavioural evidence that both eye and hand movement preparation produce covert orienting of attention. Furthermore, this binding of action and perception appears to be obligatory; even when participants attempt to orient elsewhere, motor preparation carries attention to the action target. So, although high-level decisions about how to interact with the world rely on perceptual representations, once the decision to act has been made, visual perception becomes yoked to action.
We set out to show that a great deal of our daily visual processing is intimately linked with the motor system. Much of that processing, in fact, proceeds without our being aware of it, and it automatically drives our actions. We began the chapter by describing some of the cortical areas that have been shown to provide direct links between incoming visual information and real-time motor output. Investigations of neurological conditions such as visual agnosia and blindsight (impaired visual awareness), optic ataxia (impaired control to peripheral targets), and hemispatial neglect (impaired attention and perceptual representation in one half of visual space) have furthered our understanding of visuomotor control, and many of the findings from these populations are consistent with the idea that visual processing in the PPC is action-related and inaccessible to conscious awareness. Behavioural studies in non-patient participants have also shown that vision-for-action can operate without any awareness on the part of the performer. For instance, masking studies reveal motor responses driven by unperceived stimuli; saccadic suppression studies show automatic responses to unperceived location changes; and motor learning studies show that awareness of a perturbation is not necessary for, and may even be detrimental to, visuomotor adaptation.
Having demonstrated that actions can sometimes access visual information that perception does not, we went on to examine ways in which our actions can also dictate what our perceptual system sees. We discussed the link between eye movements and the pickup of visual information, and we provided evidence that many of our eye movements are directly driven by our plans for manual action. Moreover, visual attention for perception was shown to be bound to the saccade and/or the reach target. At the risk of overstating our case, we propose that one think of action as a tour guide to the gallery of the visual world; it dictates what the perceptual visitors get to see, and it has access to locked rooms that perception never enters.