Neural Mechanisms for Binocular Depth, Rivalry and Multistability

Introduction
The purpose of this chapter is to present a review of recent functional neuroimaging (fMRI) studies of binocular vision, including binocular depth and rivalry, as well as a review of studies of perceptual multistability. As such, we will first emphasize the binocular aspects of binocular rivalry, while later emphasizing the rivalrous aspects. The interrelationship of binocular depth and rivalry, as well as multistability, will be described with reference to fMRI studies and single-unit recording studies in animals. These studies have provided provocative new evidence that the neural substrates for depth and rivalry, as well as other forms of multistability, are remarkably similar. We will also describe our own research findings from two recent experiments, in which we performed (1) a direct comparison between binocular rivalry and depth, and (2) a direct comparison between binocular rivalry and monocular rivalry, a related form of bistability [1,2]. Our studies are unique in using both matched stimulation and comparable tasks, overcoming a limitation in the interpretation of many previous studies. As a result, these experiments are particularly relevant in delineating some of the global similarities and differences in the cortical networks activated in each of these different domains.

Binocular depth
Binocular depth perception arises as a consequence of the slightly displaced points of view of the two eyes. The horizontal displacement of image features in the two eyes (i.e., binocular disparities) makes it possible to reconstruct the depth relationships in the visual world. Binocular matching of local features in the retinal images may be used to obtain estimates of the absolute disparity (and distance) of objects or surfaces, as well as the relative disparity (or relative distances) between different objects. An example of an image with binocular depth is shown in Figure 1a. If the left and right images are cross-fused, the image appears to be tilted with the top coming forward. This occurs because the visual system interprets the greater shift in matched features at the top of the image as a displacement in front of the fixation plane. In general, crossed disparities, in which image features in the right image are shifted to the right, and image features in the left image are shifted to the left, are interpreted as in front of the fixation plane. Uncrossed disparities (with shifts in the opposite directions) are interpreted as behind the fixation plane. Absolute disparities are the total shift in front of or behind the plane of fixation, while relative disparities can be computed as the difference in these absolute disparities for different objects. Thus the initial steps in recovering binocular depth relationships involve determining the horizontal shift in image features between the two eyes.
The most prominent cortical areas which have been activated by binocular depth in previous studies have been superimposed on the right hemisphere of the human brain in Figure 2a. The brain areas highlighted in the Figure are occipital visual areas (V1, V2, V3, V4, V3A, V7 and lateral occipital cortex, LO), superior parietal cortex (SP), inferior parietal cortex/intraparietal sulcus (IP), temporoparietal junction (TPJ), ventral temporal cortex (VT), middle frontal gyrus (MF), inferior frontal gyrus (IF), premotor cortex (PM), supplementary motor area (SMA), frontal eye fields (FEF) and insula/frontal operculum (FO). The numeric labels refer to the number of fMRI studies which reported prominent activation at each cortical site (colour coded on a scale ranging from red to blue, from highest to lowest). Table 1 lists the studies which were used to compile these numbers. Note that if it was not absolutely clear that a particular area was reported, this is indicated with an asterisk in Table 1. It should be noted that some studies did not perform a whole brain analysis, or may not have had an interest in reporting activation in certain areas, so this may bias the numbers that appear in Figure 2 and Table 1. Some studies limited their analysis to occipital sites of activation only. Nevertheless, it is clear that the neural mechanisms for binocular depth perception involve a number of processing levels from early visual areas in the occipital lobe to higher-level occipito-parietal and frontal areas [3]. An early study of depth restricted the analysis to visual cortical areas, and found that disparity selectivity was present in areas V1, V2, V3, V3A and MT+ [4]. However, later studies which performed analysis over a larger number of visual areas found that the activation levels are highest in dorsal occipito-parietal areas, such as V3A, V7, V4d-topo and caudal intraparietal sulcus, relative to others [3,5-8]. Moreover, considering all these studies together, the most consistent sites of activation for depth across many previous fMRI studies were V3A, V7, V4d-topo, or other lateral occipital areas, such as MT+, lateral occipital complex, and the kinetic occipital area (KO).
Table 1. fMRI studies which highlighted particular brain areas for depth, rivalry and multistability.
These studies indicate that a number of higher-level retinotopic visual areas are particularly prominent in depth processing, and yet the borders between these areas are sometimes difficult to distinguish across different studies. For example, V4d-topo has been defined as the human topographic homolog (topolog), an area situated (1) superior to V4v, (2) anterior to V3A, and (3) posterior to MT+, and should be distinguished from more ventral lateral occipital cortex [40]. V7 is an area adjacent and anterior to V3A that contains a hemifield map [40]. Hence V7 and V4d-topo both lie between V3A and area MT+, and the border between these areas is not always clear using conventional retinotopic mapping techniques, or retrospective review. Another region, referred to as the kinetic occipital area (KO), is particularly responsive to disparity edges, and appears to lie within V4d-topo [18]. Lateral occipital areas LO-1 and LO-2 also lie within KO [39].
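The distinction between absolute and relative disparity drawn earlier in this section can be made concrete with a small geometric sketch. It assumes points lying straight ahead on the midline, an interocular separation of 63 mm, and a sign convention in which crossed (near) disparities are positive. The function names and these values are illustrative assumptions and are not taken from the studies reviewed.

```python
import math

# Assumed interocular separation in metres (illustrative value only).
INTEROCULAR_M = 0.063


def vergence_angle(distance_m, interocular_m=INTEROCULAR_M):
    """Angle (radians) subtended at a midline point by the two eyes."""
    return 2.0 * math.atan(interocular_m / (2.0 * distance_m))


def absolute_disparity(object_m, fixation_m, interocular_m=INTEROCULAR_M):
    """Disparity of an object relative to the fixation plane.

    Assumed sign convention: positive = crossed (nearer than fixation),
    negative = uncrossed (farther than fixation).
    """
    return (vergence_angle(object_m, interocular_m)
            - vergence_angle(fixation_m, interocular_m))


def relative_disparity(object_a_m, object_b_m, fixation_m,
                       interocular_m=INTEROCULAR_M):
    """Difference between the absolute disparities of two objects."""
    return (absolute_disparity(object_a_m, fixation_m, interocular_m)
            - absolute_disparity(object_b_m, fixation_m, interocular_m))
```

In this sketch the absolute disparity of each object changes whenever fixation moves, whereas the relative disparity between two objects does not, because the fixation term cancels in the difference.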

Dorsal and ventral processing streams
The important depth processing areas can be subdivided into distinct ventral and dorsal processing streams, referred to as the "what" and "where" pathways, which relate to the identification of objects and to actions relative to objects, respectively [7,10,14,23]. The dorsal stream is believed to project from the occipital visual areas to parietal areas, while the ventral stream projects from occipital to temporal areas [41]. More specifically, the ventral stream begins with V1, goes through visual area V2, then through visual area V4, and to the inferior/ventral temporal cortex, which includes lateral occipital cortex, fusiform gyrus and other ventral temporal areas, including the areas mentioned above for depth (Figure 2). The dorsal stream begins with V1, goes through area V2, then V3A, V6, V7 and MT+, and then to the posterior parietal cortex, including the parietal areas mentioned above (superior parietal lobe and intraparietal sulcus) for depth (Figure 2). The anatomical locations of some of these areas are also shown in Figure 3. The black contours superimposed on the top middle panel show locations for areas V1, V2, V3, V4d-topo, V3A, V7, lateral occipital cortex, and MT+ based upon retinotopic mapping and anatomical landmarks for one subject. (This Figure will also be discussed further below, in the discussion of rivalry and multistability.)
The dissociation between dorsal and ventral areas has been most clearly delineated in single unit recording studies in the macaque [42]. In these studies, ventral areas have been found to have some of the properties necessary for object recognition, such as a detailed 3-D shape description of surface boundaries and surface content. In fact, specific responses are evoked only by binocular stimuli in which depth is perceived, but do not vary if depth is specified by different cues [42]. Conversely, dorsal areas (such as the intraparietal sulcus) have some of the properties necessary for making actions, such as selectivity for orientation in depth of surfaces and elongated objects. Moreover, their responses are invariant to changes in depth cues [42].
A dissociation between dorsal and ventral areas in human fMRI studies relates to greater selectivity for object recognition using shape defined by disparity in ventral areas [15,17,43-46]. One particular study took advantage of the fact that objects are more easily recognized if they lie in front of a background plane, than if they lie behind a plane [17]. The stimuli consisted of stereo-defined line drawings of objects that either protruded in front or behind a background plane. The activation in ventral and lateral occipital cortex, or lateral occipital complex (LOC), was greater for the objects which were located in front of the background plane, and the activity in these ventral stream areas was also strongly correlated with behavioral object recognition performance. Several other studies also found that activity in the lateral occipital complex could be related to the representation of shape from disparity by (1) making comparisons between object shapes with or without disparity [43], or (2) by comparisons between object shape conditions in which the 2-D monocular contour did not vary but the perceived 3-D shape differed [44]. The lateral occipital complex has also been found to be selective for convex and concave shapes defined by disparity, and is preferentially selective for convex shapes. This fits with behavioral measures since the visual system shows greater sensitivity for the perception of convex shapes [47]. A final study found that the lateral occipital complex combines disparity with perspective information to represent perceived three-dimensional shape [15].
Several other studies have found dissociations between the properties of dorsal and ventral areas with regards to the representation of disparity magnitude. One study found that when disparity was parametrically varied, the BOLD signal increased with disparity only in dorsal areas of the occipito-parietal cortex (i.e., V2, V3, V3A, as well as inferior and superior parietal lobe) [10]. Another study used a comparison of activation for correlated versus anticorrelated random dot stereograms (only the former case supports a depth percept) in order to assess depth selectivity across a number of areas [8]. Disparity selectivity was found in dorsal (visual and parietal) areas, including V3A, V7, MT+, intraparietal sulcus and superior parietal lobe, as well as ventral area LO (ventral lateral occipital cortex), but not in early (V1, V2) or intermediate ventral (V3v, V4) visual cortical areas. Furthermore, only dorsal areas were found to encode metric disparity (disparity magnitude), whereas ventral area LO appeared to represent depth position in a categorical manner (i.e., disparity sign). The findings suggest that activity in both dorsal and ventral visual streams reflects binocular depth perception but the neural computations may differ [8]. Consistent with these results, a third study measured the responses across a number of occipital and parietal areas to different magnitudes of binocular disparity [7]. Across all areas, there was an increase in BOLD signal with increasing disparity. However, the greatest modulation of response was found in dorsal visual and parietal areas, including V3A, MT+, V7, intraparietal sulcus and superior parietal lobe. These differences contrast with the response to the zero disparity plane stimulus, which is greatest in the early visual areas, smaller in the ventral and dorsal visual areas, and absent in parietal areas. These results illustrate that the dorsal stream can reliably represent and discriminate a large range of disparities [7]. Moreover, these findings indicate distinct computations performed in (possibly) different cortical areas, including fusional matching, metric depth, and categorical depth.
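The contrast between correlated and anticorrelated random dot stereograms mentioned above can be sketched in one dimension. In this toy example each eye's image is a row of dots with contrast +1 or -1; the right-eye row is a shifted copy of the left-eye row, and in the anticorrelated case its contrast is inverted. The function names, the circular shift, and the row length are assumptions for illustration, not the stimuli used in [8].

```python
import random


def make_rds_row(n_dots, disparity, anticorrelated=False, seed=0):
    """Return (left, right) rows of +1/-1 dots for a 1-D random dot stereogram.

    The right row is the left row shifted circularly by `disparity` dots;
    if `anticorrelated` is True, the contrast of the right row is inverted.
    """
    rng = random.Random(seed)
    left = [rng.choice((-1, 1)) for _ in range(n_dots)]
    right = [left[(i - disparity) % n_dots] for i in range(n_dots)]
    if anticorrelated:
        right = [-value for value in right]
    return left, right


def interocular_correlation(left, right, shift):
    """Mean product of the two rows after undoing a candidate shift."""
    n = len(left)
    return sum(left[(i - shift) % n] * right[i] for i in range(n)) / n
```

At the true shift the correlation is +1 for the correlated pair and -1 for the anticorrelated pair. The anticorrelated stereogram therefore still carries a disparity signal that a purely correlation-based detector could register, even though there are no consistent feature matches to support a depth percept.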

Posterior parietal cortex
The human parietal cortex is believed to extract three-dimensional shape representations that can support the ability to manipulate objects both physically and mentally (as reviewed in [6]). Lesions to the posterior parietal lobe can cause profound deficits in spatial awareness, including neglect of the contralateral half of visual space, inability to draw simple three-dimensional objects such as a cube, and inability to estimate distance and size [48]. The superior and inferior parietal areas of activation identified in human fMRI studies of binocular depth include several intraparietal sulcus (IPS) regions involved in 3D shape perception from disparity: dorsal IPS anterior (DIPSA), dorsal IPS medial (DIPSM), ventral IPS (VIPS)/V7, and parieto-occipital IPS (POIPS) [6,9,42]. These parietal regions extract 3D shape representations that can support motor functions, such as grasping hand movements or saccadic eye movements toward objects [42]. Regions DIPSM and DIPSA have been found to be sensitive to depth structure (i.e., spatial variations in depth along surfaces arising from disparity), but not position in depth, while a more posterior region, the ventral IPS (VIPS), had a mixed sensitivity [6]. Regions DIPSM and DIPSA likely correspond to LIP and AIP in the monkey and process depth information necessary in order to make eye or hand movements, respectively [6,42]. These parietal areas (DIPSA and DIPSM) are also more strongly activated by curved surfaces than tilted surfaces. Hence these parietal areas (DIPSA, DIPSM and VIPS) show a full representation of a range of different 3D shapes from disparity, including frontoparallel, tilted and curved shapes [9]. Furthermore, these parietal areas appear to be involved in cue-invariant processing of 3D shape, including processing of monocular cues to depth (e.g., texture gradients, perspective, motion, shading) [6,42]. V7/VIPS is also an area sensitive to depth structure, depth position, as well as other cues which contribute to the representation of depth relationships, such as motion, 3D-structure from motion and 2D shape [6,9]. In previous fMRI studies, activation in this area has also been described as strongly correlated both with the magnitude of depth defined by disparity and with the amount of depth perceived by subjects [9].

Questions for future study
The functional neuroimaging studies to date have broadly defined some of the functions of different areas in binocular vision, and delineated dorsal and ventral processing streams. There appears to be a progression in the dorsal pathway from more basic binocular processing in early visual areas, towards the metrical encoding of binocular depth in parietal areas, presumably to support eye or hand movements towards objects. Likewise, the ventral pathway appears to involve a progressive refinement towards depth encoding to support object recognition. However, in either case the processing stages are not understood. Important issues for future study will be to examine this in greater detail and draw stronger inferences in relating the functions of different areas. For example, a few studies have tried to compare the representation of relative and absolute disparity across a number of areas, with conflicting results, although there does appear to be a tendency for relative disparity to be encoded in ventral areas while both absolute and relative disparity are encoded in dorsal areas [3,10,14]. The encoding of relative disparity is likely to be very important in object recognition, while both absolute and relative disparity may turn out to be important in perception for action. Future studies could explain more clearly how these different cues may be used in different contexts and with different tasks. Also, relatively few studies have examined the role of stereoscopic cues in complex object recognition. The absence of studies in this area may be related to the belief held by many investigators that binocular disparity is not critical in recognition of faces or other complex objects (for example, see [49,50]). However, ventral areas have selectivity to binocular disparity and hence it would be important to investigate further the role of these areas in binocular vision.

Binocular rivalry
If the images in the two eyes are not the same or similar, but rather incompatibly different, another distinctive perceptual state results. In binocular rivalry, incompatible images, such as left and right oblique oriented gratings, are presented to the two eyes. Observers typically perceive only one image at a time, and perception alternates between the left and right image every few seconds. An example of binocular rivalry is shown in Figure 1b. If the left and right images are cross-fused, alternations may be clearly perceived between the left oblique and right oblique oriented images. Because the retinal image stays constant, while the visual percept changes, this provides a popular method for studying conscious visual experience. Binocular rivalry has been modeled with interocular inhibition between monocular neurons representing the orthogonal left and right image components, as well as neuronal adaptation [51-54]. A monocular neuron may respond to a particular orientation (e.g., left oblique), and suppress the response of neurons tuned to the opposite (i.e., right oblique) orientation. The neuron will continue to respond until adaptation or fatigue allows the other neurons to respond in turn to the opposite orientation. In the research lab, particularly robust binocular rivalry is created by presenting a number of different types of incompatible images to the two eyes, which may include simple gratings, contours or more complex images such as a face and house or other objects [51] (see also Figure 4(a-b) and 4(e-f)).
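The interplay of reciprocal inhibition and adaptation described above can be illustrated with a minimal rate-model sketch. This is a generic two-unit model written for this purpose, not the specific formulation of any of the models cited in [51-54]. All parameter values (time constants, inhibition strength, adaptation gain) and the function name are assumptions, chosen only so that dominance alternates every few seconds.

```python
def simulate_rivalry(duration_s=60.0, dt=0.001, tau=0.01, tau_adapt=4.0,
                     drive=1.0, inhibition=2.0, adaptation_gain=3.0):
    """Simulate two mutually inhibiting, adapting units (forward Euler).

    Returns the list of dominance durations (in seconds) measured between
    successive switches of the more active unit.
    """
    u_left, u_right = 0.1, 0.0   # small head start breaks the symmetry
    a_left, a_right = 0.0, 0.0   # slow adaptation variables
    dominant = "left"
    last_switch = None
    durations = []

    for step in range(int(duration_s / dt)):
        t = step * dt
        # Net input: stimulus drive minus inhibition from the other eye's
        # unit, minus the unit's own accumulated adaptation.
        in_left = drive - inhibition * u_right - adaptation_gain * a_left
        in_right = drive - inhibition * u_left - adaptation_gain * a_right

        du_left = (-u_left + max(0.0, in_left)) / tau
        du_right = (-u_right + max(0.0, in_right)) / tau
        da_left = (-a_left + u_left) / tau_adapt
        da_right = (-a_right + u_right) / tau_adapt

        u_left += dt * du_left
        u_right += dt * du_right
        a_left += dt * da_left
        a_right += dt * da_right

        now_dominant = "left" if u_left >= u_right else "right"
        if now_dominant != dominant:
            if last_switch is not None:
                durations.append(t - last_switch)
            last_switch = t
            dominant = now_dominant

    return durations
```

With these assumed parameters the dominant unit suppresses the other until its own adaptation has built up enough to release the suppressed unit, at which point dominance switches. The sketch is deterministic, so it produces regular alternations rather than the variable dominance durations seen in human observers; a noise term would be needed to capture that variability.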
The interrelationship between binocular depth and rivalry has been a subject of longstanding debate and interest [51,55,56]. Generally, binocular rivalry ensues when the image features in the two eyes are too dissimilar to be reconciled, and depth cannot be recovered because the binocular disparities are too great. In our normal visual experience, there may be dissimilar features in the two eyes, but a strong sensation of binocular rivalry occurs only rarely. When unmatched rivalrous components are present in an image, this interferes with normal binocular depth perception [55]. Presumably this occurs because suppression from the unmatched components prevents binocular matching necessary for depth. Consistent with this, it has been proposed that binocular rivalry is the default outcome which arises when binocular fusion and depth fail [51,55]. However, recent studies have provided evidence that binocular rivalry and depth can be observed simultaneously over the same spatial location, which calls into question previous models and interpretations [2,57,58]. Moreover, there are a number of possible new interpretations which could reconcile these results. In particular, both binocular rivalry and depth involve mechanisms of binocular matching, to find correspondences between image features for depth or to detect larger, irreconcilable differences, in the case of binocular rivalry. The mechanisms for binocular rivalry may inhibit false matches at different orientations, effectively suppressing noise in neural responses and sharpening the tuning of orientational mechanisms, which can also be related to the phenomenon of dichoptic masking [59,60]. Hence the strong inhibitory interactions which we are familiar with in the phenomenon of binocular rivalry may be fundamental in the resolution of ambiguity in binocular vision. Consistent with this view, fMRI studies have found that these processing mechanisms for depth and rivalry are closely related in many brain areas, and occur in parallel throughout the visual system [2,3,5,20,61].
Models of binocular rivalry presume that binocular rivalry occurs as a consequence of interocular competition between monocular neurons, which would be expected to occur at early levels in the visual system [51-54]. This has been verified in a number of functional neuroimaging studies of binocular rivalry, which have reported eye-specific dominance and suppression in early visual areas in the occipital cortex (V1, V2, V3) [26,27,62], or the lateral geniculate nucleus (LGN) [63,64]. As expected from the theoretical models, a number of fMRI studies have also documented alternating response suppression, as well as neural activation related to attentional monitoring and selection, at later stages in the visual pathway. A number of occipito-parietal areas (e.g., V3A, V7 and intraparietal sulcus) are again represented, as was the case for depth. The most important visual areas reported for rivalry include V1, V2, V3, V3A, V4d-topo, V7 [5], lateral occipital areas (MT+/lateral occipital complex) [1,2,5,19,21-23], and ventral temporal areas [2,20-25]. In addition to these visual cortical areas, a number of frontal and parietal sites of activation were reported, which have been associated with top-down control of attention, or stimulus-driven shifts of spatial attention [65-68]. These parietal areas include superior parietal lobe [1,19-22], intraparietal sulcus [1,2,5,19-21,23,61], and temporoparietal junction [1,2,19-22]. These frontal areas associated with attentional control or shifts of attention include middle frontal gyrus (or dorsolateral prefrontal cortex) [1,2,19-22], ventrolateral prefrontal cortex and inferior frontal gyrus [1,2,19-22], as well as insula/frontal operculum [1,2,19-22]. Additional areas were reported which could also be associated with attentional shifts or related to the preparation and execution of motor reports, such as supplementary motor area [1,2,20,22], frontal eye fields (FEF) and anterior cingulate [1,2,19-21]. The activation of some areas, such as FEF, could possibly be related to eye movements, but some studies used controls to verify that the activation is not related to eye movements, but more likely related to covert shifts of attention [20,29].
Functional neuroimaging studies of early visual areas (V1, V2 or V3) have found that fMRI signal fluctuations during the perception of rivalry are generally lower than the signal fluctuations evoked by actual stimulus changes, in which the stimulus is physically replaced by the alternative [26-28]. However, much stronger correlations occur between subjective perception in binocular rivalry and activity in higher-level visual areas, such as functionally specialized extrastriate cortex [24]. For example, in binocular rivalry in which alternations are perceived between a face and house, signal fluctuations can be discerned in ventral temporal areas selective for faces and houses (i.e., fusiform face area and parahippocampal place area, respectively). The amplitudes of percept-related fMRI signal fluctuations during binocular rivalry in these visual areas are similar to those during actual stimulus alternations, suggesting that the conflict has been resolved at this stage, with no representation of the suppressed stimulus. Hence it appears that there is a progression from early visual areas towards higher-level areas in the magnitude of suppression, with the latter a closer match to the perceptual experience during binocular rivalry [53,69].

Functional interactions between cortical areas in rivalry
One fMRI study of binocular rivalry used analysis methods to detect whether temporal correlations were present in the activity in areas V2/V3 and other cortical areas during the perception of rivalry with no task [21]. Indeed, the results confirmed that many of the areas listed above for rivalry were related through a covariation of activity, indicating that these widespread, extrastriate ventral, superior and inferior parietal and prefrontal cortical areas comprise a network reflecting the changes in perception during rivalry. There was significant temporal modulation of activity in these areas that closely followed the response patterns of human subjects indicating when perceptual alternations occurred. The results indicated that cooperative interactions between extrastriate visual and non-visual areas are important for conscious visual awareness, and that the prefrontal cortex may contribute to conscious vision. Another study of binocular rivalry inferred, using an event-related design, that activity in intraparietal sulcus preceded the onset of rivalrous alternations, providing evidence for a possible causal role for this area in initiating rivalry [61]. Intriguingly, the intraparietal sulcus was the only area identified in the event-related analysis in this study, and no frontal areas were implicated as playing a causal role in perceptual alternations.

Comparison of binocular depth and rivalry
The cortical areas activated by rivalry in previous fMRI studies are shown in Figure 2b and Table 1, for comparison with depth. Some of the differences between rivalry and depth can simply be accounted for by noting that there was a larger number of studies for depth than rivalry, particularly studies interested in reporting activation levels only in occipital areas. Nevertheless, the Figure makes it clear that parietal activation (i.e., intraparietal sulcus and superior parietal lobe) was prominently reported in both depth and rivalry studies. One exception to this is the temporoparietal junction, which was often reported as an activation site for rivalry but was reported for depth in only one study, which actually employed a depth task [2]. This is consistent with our view that this area is usually active with stimulus-driven shifts of attention [65,67,70]. Overall, frontal activation was more prominent in rivalry than depth studies (such as dorsolateral and ventrolateral prefrontal cortex, middle frontal gyrus, inferior frontal gyrus, supplementary motor area, insula/frontal operculum), while occipital activation was relatively more prominent in depth studies (e.g., V3, V3A, V4d-topo, V7, lateral occipital complex and MT+). Some of these differences could be attributed to the more dynamic aspect of rivalry compared with depth, and the typical performance of a rivalry task, since these frontal and parietal areas have been associated with attention and working memory, as well as the performance of motor reports [65-68,70]. In particular, the anterior cingulate was reported only for rivalry and not for depth.
In order to address this issue, we performed a study designed explicitly to perform a direct whole-brain comparison of depth and rivalry with fMRI, with comparable stimulus patterns and tasks [2]. We used binocular plaid patterns in which depth is perceived from the near-vertical components and rivalry from the oblique components (Figure 1c). In Figure 1, the depth and rivalry components are added together to produce the plaids in which both depth and rivalry may be perceived. Subjects report that the percept of a rivalrous pattern is spatially superimposed on the tilted surface. The depth in the plaid stimulus changed every 3 s, between two possible percepts (top or bottom tilted forward). This was done in order to make a dynamic depth change that subjects could report, just as they reported dynamic changes in the rivalry task. For the depth task, subjects reported whether the top or bottom of the plaid stimulus pattern appeared to be tilted forward. The time interval of 3 s was chosen to match the mean time period between alternations for rivalry for the group of subjects. The depth change did not interfere with the rivalry percept, and subjects were able to perform a rivalry task with the identical plaid stimulus. This made it possible to compare conditions in which subjects perform either a depth or rivalry report task, while viewing identical plaid patterns, precisely matched for retinal stimulation. A comparison of the depth and rivalry task conditions would reveal the neural substrates for depth and/or rivalry.
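The way a depth component and a rivalry component can be summed into one binocular plaid may be sketched as follows. This is an illustrative construction only: it assumes sinusoidal gratings, a near-vertical component tilted by a small angle in opposite directions in the two eyes (an orientation disparity, giving a horizontal disparity that grows linearly with height), and oblique components at plus and minus 45 degrees. The function names, the spatial frequency, and the angles are assumptions and are not the parameters reported in [2].

```python
import math


def grating(x, y, orientation_deg, cycles_per_unit=2.0):
    """Sinusoidal grating in [-1, 1]; bars are rotated from vertical
    by `orientation_deg` (sign convention assumed for illustration)."""
    theta = math.radians(orientation_deg)
    phase = x * math.cos(theta) + y * math.sin(theta)
    return math.cos(2.0 * math.pi * cycles_per_unit * phase)


def plaid_pair(x, y, tilt_deg=2.0, oblique_deg=45.0):
    """Left- and right-eye plaid values at image position (x, y).

    Each eye's plaid is the sum of a near-vertical grating (depth
    component) and an oblique grating (rivalry component).
    """
    left = grating(x, y, +tilt_deg) + grating(x, y, +oblique_deg)
    right = grating(x, y, -tilt_deg) + grating(x, y, -oblique_deg)
    return left, right


def bar_disparity(y, tilt_deg=2.0):
    """Horizontal offset (right minus left) of a near-vertical bar at
    height y; it changes sign between the top and bottom of the image."""
    return 2.0 * y * math.tan(math.radians(tilt_deg))
```

In this sketch, reversing the sign of `tilt_deg` reverses the disparity gradient, which corresponds to switching between the two depth percepts (top or bottom tilted forward), while the oblique components remain orthogonal between the eyes and so remain rivalrous.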

Depth and rivalry task comparison
The most important comparison was that between the depth and rivalry task for identical plaid patterns. Our results showed that the whole brain network of activated cortical areas was remarkably similar for the rivalry task compared to the depth task when subjects viewed identical plaid patterns. These areas included the occipital, parietal, ventral temporal and frontal areas highlighted in Figure 2. Nevertheless, regions of superior and inferior parietal cortices (including temporoparietal junction and intraparietal sulcus) were activated more for the depth than the rivalry task, whereas a bias towards rivalry was seen in a lateral occipital region, calcarine, retrosplenial and ventral temporal areas. Thus, these results are important in showing that while parietal areas were clearly strongly activated by either depth or rivalry, consistent with previous studies (as discussed with reference to Figure 2 above), the activation levels were actually higher for depth when the two stimulus conditions had been equalized. This fits with an important role of these parietal areas in depth encoding in order to make hand or eye movements, which has been documented extensively (e.g., [6,9,42]). Conversely, lateral occipital area and ventral temporal areas were more specific for rivalry, consistent with a relatively greater number of studies which showed that these areas may be particularly relevant for the perception of rivalry [5,20-25].
Finally, in another manipulation, we included as a control an orientation change task, which had similar stimulus features to the depth and rivalry tasks. In this case, the subject had to indicate with a key press which way the image was rotated. The orientation change condition required binocular fusion of matched features but evoked neither depth nor rivalry, serving to isolate those stages of binocular combination. This task was also matched to the depth and rivalry tasks in terms of the number of stimulus changes (which occurred every 3 s) and key presses. When the orientation change task was subtracted from either the depth or rivalry task, a lateral occipital area was highlighted, as well as V3A, V7, or ventral intraparietal sulcus (VIPS), and the kinetic occipital area (KO), including LO-1 and LO-2 [18]. This result indicated that these areas are active for either depth or rivalry, and may subserve a representation at the surface level that would facilitate the grouping of features, and allow for more than one feature (i.e., depth or rivalry) to be coded at a spatial location [2].

Conclusions: Comparison of depth and rivalry
In conclusion, the combined results of fMRI and psychophysical studies indicate that depth and rivalry are processed in a similar network of cortical areas and are perceived simultaneously by coexisting in different spatial frequency or orientation channels (see [2,58] for further discussion of the latter point). An important aspect of the results reviewed was that the same frontal and parietal areas were prominently activated for both depth and rivalry. Thus, by matching depth and rivalry for stimulus characteristics and task, we found that globally similar sites were activated, even though depth does not involve overt, endogenous competition between alternate percepts. We confirmed that all of the prominent sites of activation for rivalry were also present for depth, including frontal (FEF, PM, SMA, MF, IF, IFJ, DLPF, VLPF, FO) and parietal areas (SP, IP and TPJ). These frontoparietal areas have traditionally been implicated in visual tasks requiring spatial shifts of attention and working memory [70]. Moreover, functional imaging experiments have shown that the superior parietal cortex is also engaged by successive shifts of spatial attention [71].

Multistability
Binocular rivalry is a specific example of a more general perceptual experience, multistability. Multistable images comprise important examples of conscious visual perceptual changes without any change in the stimulus being viewed. Multistability can be induced by using an ambiguous figure with more than one perceptual interpretation such as the Necker cube [72] or Rubin's vase/face [73] (Figure 4). For example, the image in Rubin's vase/face can be interpreted as either a vase or face, and formal observation shows that the perceptual organization changes between the face and vase over time. In a similar way, the Necker cube can be perceived with one face coming forward, or the other face forward, and the percept fluctuates over time between these two possible organizations. As in the case of binocular rivalry, the retinal image stays constant, while the conscious percept changes. This lends itself to an investigation of visual conscious perception without a confounding stimulus change, as we have already seen for binocular rivalry. However, in comparison with binocular rivalry, observers do have somewhat greater voluntary control over their perception in these examples of multistability, and are better able to bias their interpretation towards one percept or the other [74]. Other examples of multistability include the rotating structure-from-motion sphere, which can be perceived to rotate in two different directions [36,38], and the apparent motion quartet, in which the perceived motion alternates between two different directions [19,33,75]. Another example of apparent motion is the spinning wheel, in which the perceived direction of rotation alternates between two directions [30]. Monocular (pattern) rivalry is yet another example of multistability in which a composite image is shown to both eyes, such as the sum of orthogonal gratings (Figure 4c) or a face/house composite (Figure 4g) [76]. These examples of monocular rivalry can be compared with examples of binocular rivalry in which either gratings or face/house pairs are shown to the left and right eyes (Figure 4a-b and e-f). Binocular rivalry can be perceived if (a-b) or (e-f) are cross-fused. Binocular rivalry can also be perceived for (c) and (g) if these are viewed using red-green stereoglasses. In monocular rivalry, the observer experiences perceptual alternations in which the two stimulus components (e.g. left and right oriented gratings) alternate in clarity or salience. The experience is similar to perceptual alternations in binocular rivalry, although the alternations are more difficult to perceive, because neither component is completely suppressed [69,77]. Thus in all these examples of multistability, the alternations between the different possible percepts are more subtle, compared with the near total suppression of one eye's image which occurs with binocular rivalry.
The overall global pattern of activation sites for depth, rivalry and multistability can be compared in Figure 2 and Table 1. Parietal activation (i.e. superior parietal lobe and intraparietal sulcus) was prominently and equally reported in all three cases. However, an exception to this was the temporoparietal junction, which was reported for rivalry and multistability, but not for depth (with the exception of [2]), consistent with a role for this area in stimulus-driven shifts of attention [65,67,70]. Furthermore, there was overall more frontal activation (e.g. dorsolateral and ventrolateral prefrontal cortex, supplementary motor area, and insula/frontal operculum) for either rivalry or multistability, compared with depth. These frontal areas could be associated with top-down control of attention or stimulus-driven shifts of attention, as well as the planning and execution of motor responses. Conversely, there was greater emphasis on occipital areas (e.g. V7, V4d-topo, V3, V3A, lateral occipital complex, MT+) for depth compared with rivalry or multistability. In other words, the balance between frontal and occipital activation was in favour of frontal areas for multistability or in favour of occipital areas for depth, with rivalry falling in between. Again, part of these differences can be attributed to the fact that there were more depth studies that had an interest in reporting activation in occipital areas, but even taking this into consideration, the overall pattern shows that occipital activation was relatively more prominent in depth studies. In general, few previous depth studies performed a whole-brain analysis [2,3,5,7-12,16], and of these few, only three reported frontal activation [2,9,10]. It was not clear whether this was simply an absence of reporting, or due to the fact that there was no frontal activation because a task was not being performed in most of these studies. The areas reported in these few whole-brain studies were the usual set of prominent occipito-parietal areas we might expect, such as V2, V3, V3A, V4d-topo, V7, intraparietal sulcus and parietal lobe [2,3,5,7-12,16]. A useful area for future study would be more matched comparisons between depth and rivalry, in which dynamic changes in depth (and a task to report depth percepts) could be used to make a direct comparison to rivalry studies.

Comparison of monocular and binocular rivalry
As we have encountered before, there appears to be a global trend towards a slightly different distribution of frontal, parietal and occipital activation across binocular rivalry and other multistability studies.Yet, one of the difficulties in comparing results across rivalry and multistability studies is that the studies were not carried out with equivalent stimulus conditions, tasks or methodology, and functional imaging analyses.To address this, we carried out an fMRI study explicitly designed to perform a direct comparison between binocular rivalry and an example of multistability (monocular rivalry), using matched retinal stimulation and comparable tasks [2].We used orthogonal gratings for binocular rivalry (left or right oblique grating in each eye) or monocular rivalry (sum of orthogonal gratings in each eye), as shown in Figure 4. Coloured stimuli were used in order to enhance the percept of monocular rivalry.As described earlier, the perceptual alternations in monocular rivalry are more subtle than those in binocular rivalry, reflecting less perceptual suppression [69,77].
A direct comparison of monocular and binocular rivalry using gratings is attractive as the same images with matched retinal stimulation can be used for both forms of bistability in order to isolate the effect of suppression, and to determine if they share common neural mechanisms. We anticipated that the effects of perceptual suppression would be evident in a lower BOLD signal for binocular compared with monocular rivalry in early visual areas, such as V1, V2 or V3. We also used so-called 'rivalry replay' conditions, in which the entire stimulus was physically changed between the two possible percepts, using the identical temporal sequences reported earlier during rivalry with button presses. This is intended to mimic rivalry in terms of stimulus changes and motor demands, and allows subtractions to be made between rivalry and replay in order to isolate the neural substrates which may be more directly related to the perception of rivalrous alternations. Some results are shown in the form of brain activation maps, averaged across six subjects (Figure 3). The activation for monocular rivalry or binocular rivalry with grating stimuli above the baseline condition is shown at three contrasts (9%, 18%, 36%). A view from the back of the human brain is shown (right hemisphere only). The colour scale indicates statistically significant results ranging from t=2.35 to 8.00 (orange-yellow) (FDR, p<0.05). Compared to a blank screen, both binocular and monocular rivalry show a U-shaped function of activation as a function of stimulus contrast, i.e. higher activity for most areas at 9% and 36%. The sites of cortical activation for monocular rivalry included occipital pole (V1, V2, V3), ventral temporal cortex (including fusiform gyrus), superior parietal cortex, ventrolateral prefrontal cortex, dorsolateral prefrontal cortex, supplementary motor area, frontal eye fields, and insula/frontal operculum. Interestingly, the areas for binocular rivalry were more widespread, and also included lateral occipital regions, as well as inferior parietal cortex, including intraparietal sulcus and temporoparietal junction (TPJ). In particular, MT+, lateral occipital complex and V3A were more active for binocular than monocular rivalry for all contrasts. The comparison of binocular rivalry with the replay condition was particularly important in isolating the neural substrates for the perception of rivalry, and also highlighted these same regions of activation. The more widespread activation pattern for binocular than monocular rivalry may be consistent with the presence of neural competition at higher-level areas, as well as greater effects of attention. As anticipated, when binocular and monocular rivalry were directly compared, an interaction with stimulus contrast was found in early visual areas V1, V2, and V3. Binocular rivalry evoked greater activation than monocular rivalry for the low contrast images. However, at higher stimulus contrasts, where perceptual suppression was more complete, the response to binocular rivalry fell below that to monocular rivalry.
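The logic of a 'rivalry replay' condition can be illustrated with a minimal sketch. It assumes that the button presses recorded during a rivalry run are available as a list of switch times in seconds; the function name, the percept labels and the example times are hypothetical and are not taken from the study described here. The sketch simply converts the reported switch times into a sequence of stimulus intervals that could be physically presented during replay.

```python
from typing import List, Sequence, Tuple


def build_replay_schedule(
    press_times: Sequence[float],
    first_percept: str,
    run_duration: float,
    percepts: Tuple[str, str] = ("left", "right"),  # hypothetical labels
) -> List[Tuple[float, float, str]]:
    """Convert reported switch times into (onset, offset, percept) intervals.

    Each button press is assumed to mark a switch to the other percept.
    Presses outside the open interval (0, run_duration) are ignored.
    """
    if first_percept not in percepts:
        raise ValueError("first_percept must be one of the two percepts")

    times = sorted(t for t in press_times if 0.0 < t < run_duration)
    current = percepts.index(first_percept)
    onset = 0.0
    schedule = []
    for t in times:
        schedule.append((onset, t, percepts[current]))
        onset = t
        current = 1 - current  # alternate between the two percepts
    # The final interval lasts until the end of the run.
    schedule.append((onset, run_duration, percepts[current]))
    return schedule


# Hypothetical example: three reported switches in a 10 s run.
schedule = build_replay_schedule([2.5, 6.0, 7.5], "left", 10.0)
for onset, offset, percept in schedule:
    print(f"{onset:5.1f} - {offset:5.1f} s: show {percept} grating")
```

Because the replay schedule inherits the timing of the reported alternations, a subtraction of replay from rivalry leaves stimulus changes and motor demands approximately matched, which is the rationale given above.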

U-shaped function of activation
One of the important results of the study was that both binocular and monocular rivalry showed a U-shaped function of activation as a function of contrast. Current models and concepts regarding binocular rivalry can explain this pattern (e.g. [51-54,56,69]). Rivalry models include inhibitory neurons in addition to excitatory neurons to account for interocular inhibition and suppression. In addition, the contribution of inhibition and suppression would generally be expected to lower the BOLD signal. At high contrasts, we expect the activation to increase due to an increasing neuronal response gain, which also leads to faster alternation rates, explaining the increase from 18% to 36% contrast. The increase in activation at the lowest contrast can possibly be explained as reflecting disinhibition, assuming that the excitatory and inhibitory neurons have different thresholds. At low contrasts, inhibitory neurons would not be strongly activated, resulting in slower alternation rates. Thus the higher BOLD signal at 9% contrast might be due to a release from inhibition that accompanies slow alternation rates.
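The qualitative argument above can be made concrete with a toy calculation. This is only an illustrative sketch, not one of the cited rivalry models: it assumes an excitatory drive that grows linearly with contrast and an inhibitory term with a higher threshold that saturates once recruited. All function names and parameter values are hypothetical and were chosen by hand to show that such an arrangement can yield a non-monotonic net response.

```python
import math


def inhibition(contrast: float, threshold: float = 13.0, slope: float = 1.5) -> float:
    """Saturating inhibitory activity with a threshold (assumed sigmoid form)."""
    return 1.0 / (1.0 + math.exp(-(contrast - threshold) / slope))


def net_response(
    contrast: float,
    gain: float = 0.02,      # assumed excitatory response gain per % contrast
    baseline: float = 0.5,   # assumed constant offset
    weight: float = 0.5,     # assumed strength of the inhibitory contribution
) -> float:
    """Excitatory drive minus weighted inhibition (arbitrary units)."""
    return baseline + gain * contrast - weight * inhibition(contrast)


# Contrasts used in the study (in %); the responses are in arbitrary units.
responses = {c: net_response(c) for c in (9.0, 18.0, 36.0)}
for contrast, value in responses.items():
    print(f"contrast {contrast:4.0f}%: net response {value:.3f}")
```

With these hand-picked parameters the net response is lower at 18% than at either 9% or 36%, because inhibition is barely recruited at 9% and the growing excitatory drive outweighs the saturated inhibition at 36%. The sketch says nothing about alternation rates or about how neural activity maps onto the BOLD signal.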

Role of parietal areas
An important result of the study was to show that in addition to the activation of visual areas presumed to be involved directly in competition between neural representations, there was also activity for either binocular or monocular rivalry in frontoparietal areas that are often implicated in attention, and previously identified for binocular rivalry [65-67,70]. The previous literature seems to indicate that the balance of frontal activation may have been slightly higher for multistability than rivalry (as shown in Figure 2). But in our study in which we matched binocular and monocular rivalry for stimulus features and used comparable tasks, the frontal and parietal activation was actually somewhat higher for binocular rivalry, and included temporoparietal junction (TPJ), which was not an area significantly activated for monocular rivalry. The TPJ is modulated by stimulus-driven attentional shifts to unexpected objects or events [65,67,70]. It is possible that the TPJ was less active for monocular rivalry since the perceptual changes did not signal a change in object identity, as in binocular rivalry. All the other forms of multistability studied in fMRI paradigms produced some TPJ activation, including ambiguous figures [29,32,35,37], apparent motion [19,30,33] or structure from motion [36]. Hence a change in object identity and stimulus-driven shifts to unexpected events may be very relevant to the perceptual experience of binocular rivalry and other forms of multistability.

Visuospatial attention in control of multistability
Two aspects of the deployment of attention in vision have been studied extensively using physiological methods: the effects of attention on modulating neural responses in early visual cortical areas, and the top-down control of attention from executive control regions of the brain [66,68].The effect of visuospatial attention to a stimulus at a peripheral location (while maintaining central fixation) is to increase the cortical response associated with that stimulus in striate and extrastriate visual areas within the contralateral hemisphere compared to when that stimulus is not attended (e.g.[71]).This contralateral attention effect has been shown to operate on the precise retinotopic cortical representation of the attended stimulus.Visual attention can also operate by modulating the cortical responses to a given stimulus feature [71].In contrast, the top-down control of spatial attention has been associated with activity in the dorsolateral prefrontal and posterior parietal cortex, including intraparietal sulcus and superior parietal lobe [68], and transient activity within these regions is thought to initiate a shift of attention between locations, features, or objects.Thus, the effect of attention is to modulate neural activity in visual areas, while the control of attention has been associated with transient activity in frontal and parietal cortex that occurs at the onset of attentional switches [65,70], in addition to sustained activity in these areas that maintains a given attentive state.Studies of the voluntary control of ambiguous figure reversals have also revealed transient frontoparietal activation, suggesting that there may be a common mechanism subserving the voluntary deployment of attention and voluntary control over perceptual bistability [32,34].
One pertinent study investigated whether the voluntary control of perceptual configuration in a multistable stimulus (Necker cube) is mediated by voluntary shifts of selective attention, using event-related functional imaging [32]. Two slightly different versions of the Necker cube display were used during attention and perception conditions. In the attention condition, participants were cued to shift attention between the squares in left and right hemifields. In the perception condition, corresponding corners of the squares were connected by horizontal lines producing a perceptually multistable Necker cube. Observers reported which of the two faces appeared forward in depth, and were provided with cues to induce voluntary perceptual reversals. Both the perception and attention conditions yielded increased activity in contralateral occipital visual areas (V1v, V2v, VP, V3, V3A, V4v, MT+, V1d, V2d). Furthermore, voluntary shifts of attention and voluntary shifts in perceptual configuration were associated with common activity in the posterior parietal cortex (superior parietal lobe and intraparietal sulcus), part of the frontoparietal attentional top-down control network [66]. These results support the hypothesis that voluntary shifts in perceptual bistability in the Necker cube are mediated by spatial attention [32].

Transitions between percepts in binocular rivalry or multistability
A recent study took a different approach in studying these issues, noting that a number of previous binocular rivalry studies have found a large network of frontal and parietal cortical areas (as in Figure 2) to be active around the time of perceptual transitions between interpretations [19]. As described earlier, some previous rivalry studies have used subtractions between rivalry and 'rivalry replay' conditions to isolate rivalry mechanisms, and these frontal and parietal activations were still present following these subtractions [2,20,22]. It is possible that this activation could be related to the difficulty in judging the transitions during real rivalry alternations. The investigators noted in particular that some transitions occur virtually instantaneously, with one percept abruptly suppressing the alternative percept, whereas other transitions comprise dynamic mixtures of both percepts for a period of time before one percept dominates completely. They studied the role of this frontoparietal activation, with specific interest in its relation to the temporal structure of transitions, which can be either instantaneous or prolonged by periods during which observers experience a mix of both perceptual interpretations. Using both bistable apparent motion and binocular rivalry, they found that transition-related frontoparietal activity is larger for transitions that last longer, suggesting that the frontoparietal activation remains throughout the duration of the transition. They also found that frontoparietal activity during binocular rivalry transitions exceeded activity during abrupt transitions simulated using rivalry replay, as was found previously in a number of studies [2,20,22]. However, they confirmed that this only occurs when perceptual transitions are replayed as instantaneous events. When replay depicts the transitions with the actual durations reported during rivalry, then transitions mimicked with replay and genuine rivalry produced equal activation levels in frontoparietal areas. The results are consistent with the view that at least a component of frontoparietal activation during bistable perception reflects a response to rivalrous (or replay) perceptual transitions rather than their cause. Hence the results shed light on the functional role of frontoparietal activity and the mechanisms underlying perceptual reorganizations during bistable perception. This activation could reflect the change in sensory experience and task demand that occurs during transitions, which fits well with the known role of these areas in attention and decision making [65-67,70,78,79].

Methodological issues in fMRI studies and role of frontal areas
Some of the differences in the results across depth, rivalry and multistability studies can be explained due to the use of differing methodology and functional imaging analysis methods.The majority of rivalry studies have used event-related designs which correlated activations in different brain areas to the start of each alternation [5,19,20,22,28,61], while a smaller number of rivalry studies used block designs in which stimulus blocks with rivalry were contrasted with blocks of rivalry replay [1,2,26].One other rivalry study analyzed temporal correlations between cortical areas during passive viewing of rivalry [21].In general, multistability studies have used methods which are quite similar to those used in rivalry studies.For example, a large number of multistability studies used event-related designs correlating brain activation to reversals [29,30,32,33,36], while others used block designs comparing multistability to baseline conditions [31,34,35,37], or multivariate pattern analysis to predict perceptual states [38].
These differences in methodology are obviously related to current concepts of rivalry and multistability as essentially dynamic perceptual phenomena while depth is static, but it should also be acknowledged that these differences could systematically affect the outcome of these studies. In particular, the fact that subjects usually performed a task in rivalry or multistability studies but not depth studies could explain why frontoparietal activation was more likely to be reported for rivalry or multistability. However, this is not the whole story, as several studies have found that frontoparietal activation is present for passive viewing of rivalry (including areas SP, IP, PM, FEF, SMA, MF, IF and FO in Figure 2), even when there is no task [2,21]. We noted in our own study, though, that although activation in these widespread areas was still present, the absolute levels were lower with no task [2]. One multistability study which used passive viewing found that the typical parietal activation was present (superior and inferior parietal areas including TPJ), as well as one frontal site of activation (i.e. premotor cortex), but no significant activation of any other frontal areas; notably, there was no significant activation in middle or inferior frontal gyrus [37]. Hence, frontoparietal activation is still present when there is no task, but it is reduced. Some other studies of multistability have used tasks involving spatial shifts of attention instead of the more typical motor responses. One particular study of multistability which used spatial shifts of attention between the two possible percepts but no motor reports found activation in parietal areas (SP, IP), and a smaller subset of frontal areas, including only SMA, PM, and MF [35]. As described above, a second study of multistability which used spatial shifts of attention between two possible percepts found voluntary shifts of attention associated with activation of essentially the same sets of areas (namely parietal areas, SP, IP) and a small subset of frontal areas, including SMA but no significant activation in MF, IF or prefrontal cortex [32]. Hence, the use of tasks involving spatial shifts of attention tends to restrict frontal activation, although the usual sites of parietal activation (i.e. SP, IP) are still present.
The use of an event-related design also has an impact on results. Studies of multistability which used block designs reported overall less frontal activation, although parietal activation was consistently reported in these studies [31,34,35,37]. This could be because frontal area fluctuations only occur at the onset of alternations and sum to zero over the longer periods. Most binocular rivalry studies used event-related designs, so it is difficult to assess what effect this has on the results.
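The suggestion that transient fluctuations may be visible to an event-related analysis yet vanish in a block average can be illustrated with a toy time series. The sketch below rests on a strong assumption made purely for illustration: each alternation evokes a brief biphasic deflection whose positive and negative lobes cancel exactly. The kernel, the event times and the function names are hypothetical, and no hemodynamic model is implied.

```python
from typing import List, Sequence


def simulate_run(n_samples: int, event_onsets: Sequence[int], kernel: Sequence[float]) -> List[float]:
    """Add one copy of the kernel to a flat baseline at each event onset."""
    signal = [0.0] * n_samples
    for onset in event_onsets:
        for lag, value in enumerate(kernel):
            if onset + lag < n_samples:
                signal[onset + lag] += value
    return signal


def block_mean(signal: Sequence[float]) -> float:
    """Average over the whole block, as in a simple block contrast."""
    return sum(signal) / len(signal)


def event_locked_average(signal: Sequence[float], event_onsets: Sequence[int], window: int) -> List[float]:
    """Average the signal in a fixed window following each event onset."""
    epochs = [signal[o:o + window] for o in event_onsets if o + window <= len(signal)]
    return [sum(column) / len(epochs) for column in zip(*epochs)]


# Assumed zero-sum biphasic transient and hypothetical alternation times (in samples).
kernel = [1.0, 0.5, -0.5, -1.0]
onsets = [5, 20, 35, 50]
run = simulate_run(60, onsets, kernel)

print("block mean:", block_mean(run))
print("event-locked average:", event_locked_average(run, onsets, len(kernel)))
```

In this toy case the block mean is zero while the event-locked average recovers the transient, which is one way the choice of design could influence whether frontal activation is reported. Real responses need not cancel in this way, so the sketch shows only that the explanation is coherent, not that it is correct.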
In comparing the results across the depth, rivalry and multistability studies, a few other trends are apparent. A number of occipital areas, notably V2, V3, V3A, V4d-topo, V7 and MT+, were more frequently reported in depth studies than in either rivalry or multistability studies. These areas may not have been frequently reported in studies of multistability because the analysis methods (usually event-related designs) would not find large signal differences in low-level visual representations since the visual appearance of the stimulus barely changes during alternations. For example, multistability studies involving apparent motion usually did not report any activation in these visual areas (with the exception of MT+), likely because of the similarity in stimulus configuration between the two possible percepts [19,30,33]. Likewise, there may not have been large signal changes in these areas occurring at the onset of binocular rivalry alternations because the two alternative percepts would not selectively activate any of these areas. Some of the rivalry and multistability studies that did report activation in these areas had one stimulus aspect in common: there was a depth interpretation present in the stimulus alternatives (for example, rotating structure-from-motion sphere, or slant/perspective rivalry [5,38]). In contrast, ventral temporal areas, including fusiform gyrus, were more likely to be active in studies which used faces as one of the two possible percepts, such as faces/grating stimuli [20,21,24,29]. Another stimulus difference which could explain trends is that a number of the examples of multistability which were used had a dynamic aspect (e.g. apparent motion), which was in addition to the multistable percept itself. In general, the frontal activation was greater and included a larger number of areas for the dynamic examples of multistability [19,30,33,36,37,38], compared with static examples [29,31,34,35].

Future research
A number of questions remain unanswered by the existing functional imaging studies on binocular depth, rivalry and multistability. Current models of binocular vision need to be revised in order to explain the interrelationship between depth and rivalry and explain why they are processed in parallel through a number of cortical areas [51-54,56]. It may be possible that the strong inhibitory interactions which we are familiar with in binocular rivalry may serve the purpose of resolving ambiguity in binocular vision. The mechanisms for binocular rivalry may be important in inhibiting false matches at different orientations, suppressing noise in neural responses and sharpening the tuning of orientational mechanisms [59,60]. A more general binocular vision model would incorporate these important inhibitory mechanisms, together with binocular matching which is necessary for depth perception. In addition, it is important to incorporate the finding that it is possible to perceive both depth and rivalry simultaneously at a single spatial location. There may be a representation at the surface-level that would facilitate the grouping of binocular depth and rivalry features, and allow for more than one feature to be coded at a spatial location [2].
A common set of frontoparietal cortical brain areas are activated during depth, rivalry or multistability, implying that there is an underlying cortical network with a complex interplay of neural processing between cortical brain areas, which is not yet understood.Such frontoparietal activations could reflect top-down processes that initiate a reorganization of activity in visual cortex during perceptual reversals.Alternatively, as a result of neural activity fluctuations in visual cortex, frontoparietal activations could merely reflect the feed-forward communication of salient neural events from visual cortex to higher-level areas.These two possibilities differ in the causal chain assumed to underlie changes in visual awareness, but it remains difficult to infer causality from correlative neurophysiological measures.Ideally, this would be addressed by probing the causal role of frontal and parietal areas using experimental lesion and microstimulation techniques.For example, a recent study which used transcranial magnetic stimulation (TMS) to create virtual lesions showed that particular frontal cortical areas (e.g.dorsolateral prefrontal cortex) were causally relevant for voluntary control over perceptual switches in a multistable structure-from-motion stimulus [80].Other observations that activations in frontal and parietal areas precede activity associated with the sensory processing of perceptual switches also suggest that feedback signals from frontoparietal areas modulate visual processing [33,81].
However, other results reviewed earlier suggest that the ultimate resolution will be more nuanced and complicated than the dichotomy referred to above (e.g., [19,32,61]).One particularly appealing framework previously proposed suggests that the frontal and parietal areas form part of a sensorimotor continuum and are designed to periodically check or update the current perceptual organization in the visual system [82,83].Hence this central control network would mediate between alternative perceptions for conscious awareness.This process may in fact occur all the time in natural vision, but would usually proceed unnoticed, resulting in a stable perception of the visual world.In any case, it will be important to carry out further studies in order to clarify the functional role of frontoparietal activity and determine the manner in which it relates to the mechanisms underlying perception in general, and reorganizations during bistable perception.

Conclusions
A review of recent functional neuroimaging studies indicates that binocular depth, rivalry and multistability are three perceptual processing domains which share neural substrates, including largely overlapping occipital, parietal and frontal cortical areas.All three of these perceptual processing modalities can be conceptualized as a series of visual perceptual processing stages in occipital areas, as well as higher-level cognitive functions in parietal and frontal areas, involving decision making, motor planning and execution, attention, awareness and memory.Current research will further study the manner in which these cortical areas interact, and the causal sequence of events which underlies each of these three perceptual processing modalities, recalling some of the most important themes of neuroscience in these overlapping and interrelated functions.

Figure 1.(a) Binocular depth and (b) rivalry.(c) Plaids in which both depth and rivalry are perceived.