Stereoscopy and Autostereoscopy

For a seamless mixed reality visual experience, the display device needs to be versatile enough to support 2D as well as 3D stereoscopic and autostereoscopic see-through information display. Single-viewer 3D stereoscopic information display is now relatively mature and straightforward to accomplish, but it remains a challenge for multiple concurrent users. Virtual reality information display for a single viewer is likewise relatively mature. Seamlessly overlaying augmented reality information onto a 3D world, however, is more challenging, and a mixed reality display approach that includes all of these capabilities is orders of magnitude more challenging still. This chapter provides a treatise on the stringent requirements for autostereoscopic information displays, as well as switchable 2D-3D autostereoscopic information displays, as a guide for designing better mixed reality displays. It concludes by proposing an alternative approach for a switchable 2D-3D see-through mixed reality information display.


Introduction
Binocular stereoscopic depth cues underpin the 3D stereoscopic and autostereoscopic aspects of this chapter. A brief introduction to the more stringent requirements of autostereoscopic 3D theory is therefore essential. This introduction covers the relevant physics, psychophysics and mathematics.
To begin, it is self-evident that closing one eye does not immediately render the world completely two-dimensional and flat [1]. This is because monocular and oculomotor depth cues can still be used to judge a scene's depth, as in conventional 2D displays. Research shows that combining these cues with binocular stereoscopic cues provides better depth sensations [1,2]. The ability to perceive depth and extract 3D information from a scene relies significantly on the binocular disparity that results from the two eyes each receiving a slightly different perspective of the same 3D scene [1,3,4,6]. The brain then processes this disparity to produce a sense of depth and stereopsis.

Autostereoscopic 3D display theory
The horopter is the set of points that are perceived to be at the same depth as the fixation point F by the left eye L and the right eye R. Panum's fusional area, on the other hand, is the range around the horopter within which objects are perceived as fused single images [1,3,4,6,7]. See Figure 1.
The fixation point F projects to the same location on the retina of the left and right eye, resulting in no binocular disparity. Points in front of or behind the fixation point F, however, project onto different locations on the retinas of the two eyes, thereby resulting in binocular disparity [1]. The brain then processes this binocular disparity to produce the sensation of stereoscopic depth [1]. Suppose angle LBR is designated b, angle LFR f, angle LAR a and angle LCR c. Disparity can then be defined in angular terms, which is commonly referred to as angular disparity in display physics [1,3,4,6,7]. Formally, the angular disparity α of a point is the difference between the vergence angle f at the fixation point and the vergence angle at that point.
Thus, for points A and B the angular disparities would be $\alpha_A = f - a$ and $\alpha_B = f - b$. Stereo acuity, usually denoted by the symbol δ, is defined as the smallest perceivable change in angular disparity between two objects [1]. In humans the average stereo acuity is considered to be 20 arc seconds [8]. Suppose that in Figure 1 points A and C are separated by the smallest distance at which their difference in depth can be perceived. This is then also the separation at which the difference in their angular disparities can just be perceived. Thus, it follows that $\delta = |\alpha_A - \alpha_C| = |c - a|$.
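The definitions above can be made concrete with a small numerical sketch. It assumes a point lying straight ahead of the viewer, so that the vergence angle is 2·atan(e / 2d) for interocular separation e and distance d; the 65 mm interocular separation and all function names are illustrative assumptions rather than values from the chapter. Only the 20 arc second acuity comes from the text.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def vergence_angle(distance_m, interocular_m=0.065):
    """Angle (radians) subtended by the two eyes at a point straight ahead.

    Assumption: the point lies on the midline between the eyes.
    """
    return 2.0 * math.atan(interocular_m / (2.0 * distance_m))


def angular_disparity(fixation_m, point_m, interocular_m=0.065):
    """Vergence angle at the fixation point minus vergence angle at the point."""
    return (vergence_angle(fixation_m, interocular_m)
            - vergence_angle(point_m, interocular_m))


def smallest_depth_step(fixation_m, acuity_arcsec=20.0, interocular_m=0.065):
    """Depth step behind the fixation point whose disparity equals the acuity."""
    delta = acuity_arcsec / ARCSEC_PER_RADIAN
    target = vergence_angle(fixation_m, interocular_m) - delta
    if target <= 0.0:
        # Beyond this fixation distance no farther point reaches the threshold.
        return math.inf
    # Invert vergence_angle() to find the distance with the target angle.
    far_m = interocular_m / (2.0 * math.tan(target / 2.0))
    return far_m - fixation_m


if __name__ == "__main__":
    step = smallest_depth_step(1.0)
    print(f"Depth step at 1 m for 20 arcsec acuity: {step * 1000:.2f} mm")
```

Under these assumptions, a stereo acuity of 20 arc seconds at a fixation distance of 1 m corresponds to a depth step of roughly 1.5 mm, and the step grows approximately with the square of the fixation distance.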

Mathematics of autostereoscopic 3D displays
Earlier research showed that it was possible to produce stereoscopic depth sensations by supplying each eye with a 2D image of the same scene taken from a slightly different angular perspective [8]. This slight difference creates an angular disparity that the brain processes to produce a sensation of depth corresponding to the given disparity. It has to be noted, however, that this is in essence an image disparity. It produces a retinal disparity that is similar, but not identical, to the natural retinal disparity produced when viewing a real-world scene [8]. For lenticular lens based glasses-free 3D displays, the basic configuration of the left and right eye pixel projection is illustrated in Figure 2 [1].
The optimal viewing distance z can be derived from similar triangles in Figure 2. As with parallax barriers, the viewing distance is restricted by the pixel pitch of the underlying 2D display as well as the interocular separation. Similarly, the expression for the lenticular lens pitch l is derived from similar triangles [1,3].
Figure 2. Illustration of the parameters used in designing a glasses-free 3D lenticular lens display (image credit: [1]).
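As a rough illustration of how such similar-triangle relations are used in practice, the sketch below evaluates the commonly quoted two-view lenticular relations: viewing distance z = e·f / i, and lens pitch l = 2·i·z / (z + f). These forms, and the symbols e (interocular separation), i (pixel pitch) and f (lens focal length, taken as the lens-to-pixel distance), are assumptions made for this example. They are not necessarily the exact expressions or notation of Figure 2.

```python
def optimal_viewing_distance(eye_separation, pixel_pitch, focal_length):
    """Viewing distance z at which adjacent pixels map to the two eyes.

    Assumed similar-triangle relation: pixel_pitch / focal_length
    equals eye_separation / z. All lengths share one unit (e.g. mm).
    """
    return eye_separation * focal_length / pixel_pitch


def lenticular_lens_pitch(pixel_pitch, viewing_distance, focal_length):
    """Lens pitch covering one left/right pixel pair as seen from z.

    Assumed relation: l / z = 2 * pixel_pitch / (z + focal_length),
    so the lens pitch is slightly smaller than two pixel pitches.
    """
    return (2.0 * pixel_pitch * viewing_distance
            / (viewing_distance + focal_length))


if __name__ == "__main__":
    # Hypothetical values: 65 mm eye separation, 0.1 mm pixels, 1 mm focal length.
    z = optimal_viewing_distance(65.0, 0.1, 1.0)
    pitch = lenticular_lens_pitch(0.1, z, 1.0)
    print(f"z = {z:.1f} mm, lens pitch = {pitch:.5f} mm")
```

The point of the example is the dependency structure rather than the specific numbers: once the pixel pitch and the interocular separation are fixed, the focal length determines the viewing distance, and the lens pitch follows from both.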
A glasses-free 3D TV for one viewer, as in the above derivations, is interesting but not very practical. For the 3D display aspect of a single-wearer mixed reality device, however, it does suffice. If in the future an advanced version were needed that enables multiple views or multiple simultaneous users of the same glasses-free 3D display, slight modifications would have to be incorporated into the design [1].
Using a vertical parallax barrier or vertical lenticular lenses to achieve autostereoscopic 3D, as above, is considered relatively simple. However, there are numerous drawbacks that affect the perceivable 3D image quality of such displays [9,10,11]. Thus, a slanted parallax barrier or slanted lenticular lenses are usually employed to reduce some of these drawbacks.

Mathematics of slanted lenticular/barrier 3D displays
In a conventional LCD display [5], each pixel comprises three subpixels in the three primary colors red, green and blue. The pixel is typically roughly square, which requires the three subpixels to be rectangular: each is approximately one unit in height and one third of a unit in width. Each subpixel is then dedicated to a specific 3D view. The view numbers are shown inside each subpixel in Figure 3, which illustrates a seven-view glasses-free lenticular lens 3D display [12]. Subpixels with the same number all belong to the same view.
This configuration reduces some of the drawbacks of the vertically oriented lenticular lens 3D displays. However, it adds a layer of complexity to the subpixel algorithmic mapping for rendering the 3D image accurately [12,13]. Figure 4 suffices to illustrate the various components of the derivations.
From Figure 4, let $P_\mu$ be the conventional lenticular lens pitch and $\alpha$ the slant angle of the lenticular lens sheet. Finding the view number of a subpixel located at an arbitrary point (x, y) on the 2D display plane requires knowing its offset in the horizontal direction, termed the X-offset, as shown in Figure 4. From Figure 4, the lenticular lens pitch along the horizontal x-direction is $P_\mu / \cos\alpha$ [12].
To determine the projection of this pitch onto the display, with the viewing point as the origin, the magnification of the lenticular lens must be taken into account. The magnification m of the lenticular lens can be expressed in terms of the viewing distance z and the focal length f of the lenticular [12]. Applying it to the horizontal pitch gives its projection onto the display plane, which we shall term the horizontally projected pitch $P_{\mu H}$ (Eq. (10)).
The desired X-offset of an arbitrarily positioned pixel at (x, y) then follows [12]. Moreover, dividing the horizontally projected lenticular lens pitch in Eq. (10) by the pixel pitch $P_h$ yields a particularly important number: the number of views in one row per lenticular lens, denoted X in Eq. (12).
As an important practical side note, the above expression by itself does not seem to mean much, as it masks some rather important detail. However, looking closely at Figure 3 reveals something that becomes apparent only after designing several lenticular lens based glasses-free 3D displays: the number of views along a horizontal line per lenticular lens is always half the total number of views $N_{total}$ of the whole 3D display. Thus, the above expression can be rewritten with $X = N_{total}/2$.
(Figure caption: Showing the conventional 2D display subpixels behind a slanted lenticular lens sheet in a typical slanted lenticular lens glasses-free 3D display system (image credit: [12]).)
This unassuming and uncelebrated expression is of great use to any would-be designer of slanted lenticular lens 3D displays, because it connects the parameters that are essential to the actual design process. It gives the slanted lenticular lens pitch $P_\mu$ required for a 3D display with a desired number of views $N_{total}$ at a lens slant angle $\alpha$, given that the LCD pixel pitch is $P_h$. Rearranging the expression for $P_\mu$ gives Eq. (14).
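For orientation, the chain of relations described in words in this section can be written out as follows. This is a reconstruction from the verbal definitions, not a verbatim copy of the numbered equations. In particular, the magnification factor $(m+1)/m$ is an assumption, chosen because it follows the form commonly used in the slanted-lenticular literature and reproduces the simplified pitch quoted for a magnification of 0.5.

```latex
\begin{align*}
  \text{horizontal pitch:} \quad & \frac{P_\mu}{\cos\alpha} \\
  \text{horizontally projected pitch:} \quad
      & P_{\mu H} = \frac{m+1}{m}\,\frac{P_\mu}{\cos\alpha} \\
  \text{views per lens in one row:} \quad
      & X = \frac{P_{\mu H}}{P_h}
          = \frac{(m+1)\,P_\mu}{m\,P_h\cos\alpha}
          = \frac{N_{total}}{2} \\
  \text{rearranged for the lens pitch:} \quad
      & P_\mu = \frac{m}{m+1}\,\frac{N_{total}}{2}\,P_h\cos\alpha
\end{align*}
```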
In practice, the client or 3D display manufacturer usually provides the 3D display designer with the LCD pixel pitch and the number of views needed. The designer then fixes a convenient slant angle, normally 9.4623 degrees. Why 9.4623 degrees, one could ask, and why so specific? The reason is that the cosine of 9.4623 degrees is approximately 0.99, which for the sake of computation can be approximated as 1 without loss of generality on the display macro scale in real-world practice. Next, the 3D display designer chooses a lenticular lens magnification, usually 0.5. Why 0.5? In short, these are well-chosen values. They drastically simplify the design of the lenticular lens sheet needed to accommodate the client's requirements and produce the wizardry that is high-quality glasses-free 3D, because mathematically they simplify the expression for the required lenticular lens pitch. It boils down to half the total number of the display's views times a third of the pixel pitch, as in Eq. (15).
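The design rule just described can be checked numerically. The sketch below assumes the general form $P_\mu = \frac{m}{m+1}\cdot\frac{N_{total}}{2}\cdot P_h\cos\alpha$, which is a reconstruction chosen to reduce to the stated rule (half the number of views times a third of the pixel pitch) when m = 0.5 and cos α is taken as 1. The function name and the example pixel pitch are hypothetical.

```python
import math


def lenticular_pitch(n_total, pixel_pitch, slant_deg=9.4623,
                     magnification=0.5, approximate_cos=False):
    """Slanted lenticular lens pitch for a desired number of views.

    Assumed form: P_mu = m / (m + 1) * (N_total / 2) * P_h * cos(alpha).
    With approximate_cos=True, cos(alpha) is replaced by 1 as in the text.
    """
    cos_alpha = 1.0 if approximate_cos else math.cos(math.radians(slant_deg))
    scale = magnification / (magnification + 1.0)
    return scale * (n_total / 2.0) * pixel_pitch * cos_alpha


if __name__ == "__main__":
    # Hypothetical request: seven views on a display with 0.3 mm pixel pitch.
    simple = lenticular_pitch(7, 0.3, approximate_cos=True)
    exact = lenticular_pitch(7, 0.3)
    print(f"simplified pitch: {simple:.4f} mm")
    print(f"with cos(alpha):  {exact:.4f} mm")
    print(f"tan(9.4623 deg):  {math.tan(math.radians(9.4623)):.5f}")
```

Two observations follow from running this kind of check. Setting the cosine to 1 changes the computed pitch by a little over one percent, which is the size of the approximation being made. The slant angle of 9.4623 degrees is also very close to arctan(1/6), so the lens axis shifts by one sixth of a unit horizontally per unit of vertical travel.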

Pixel mapping for lenticular Lens 3D display
With regard to the pixel mapping onto the display that enables 3D image rendering, a general expression can be derived. The starting point is a conventional LCD display with pixels arranged in an orthogonal array of red, green and blue subpixels whose coordinates lie in the x, y plane of the display. These x, y coordinates can be expressed in terms of the pixel indices, usually denoted k and l, and the horizontal pixel pitch $P_h$ [12,13].
Dividing the expression for the X-offset above by the expression for the projected horizontal lenticular pitch, multiplying by $N_{total}$, and then substituting the equivalents of x and y in terms of the indices gives the view number $V_N$ for each arbitrary pixel k, l [12,13].
Substituting k, l gives Eq. (20). Equation (20) tells which view number corresponds to each pixel on the display plane, and thus enables the correct 3D image data to be assigned to the appropriate pixel for correct 3D image rendering. The $k_{offset}$ term accounts for any horizontal shift of the lenticular lens sheet relative to the underlying LCD display.
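The mapping can be sketched in a few lines of code. The version below assumes the form commonly given in the slanted-lenticular literature, $V_N = N_{total}\,\big((k + k_{offset} - 3\,l\tan\alpha) \bmod X\big)/X$. It further assumes that k indexes subpixel columns, that l indexes pixel rows, and that the row pitch is three times the column pitch. These conventions, the sign of the slant term and the function name are assumptions for illustration and may differ from the chapter's Eq. (20).

```python
def view_number(k, l, n_total=7, views_per_lens=None,
                tan_alpha=1.0 / 6.0, k_offset=0.0):
    """Fractional view number for the subpixel in column k, row l.

    Assumed form: N_total * ((k + k_offset - 3*l*tan(alpha)) mod X) / X,
    where X (views per lens in one row) defaults to N_total / 2.
    """
    x = n_total / 2.0 if views_per_lens is None else views_per_lens
    # Python's % returns a non-negative result for a positive modulus,
    # so negative phases from the slant term wrap around correctly.
    phase = (k + k_offset - 3.0 * l * tan_alpha) % x
    return n_total * phase / x


if __name__ == "__main__":
    # Print the view map for the first two rows of a seven-view display.
    for row in range(2):
        views = [round(view_number(col, row), 2) for col in range(8)]
        print(f"row {row}: {views}")
```

In a renderer, the fractional result would typically be rounded to the nearest integer view, or used to blend neighbouring views, before fetching the colour component for that subpixel from the corresponding view image.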

Enabling 2D/3D switchable display
There are many ways to achieve a switchable 2D/3D information display. This section centers on a low-cost dual prism film conversion module. When the two prism sheets are offset by half the prism pitch, the same pixels are projected to both eyes of the user (2D mode). When the prisms of the two sheets are aligned, the pixels intended for one eye are separated from those projected to the other (3D mode). The module is simply an assembly of two sheets, each with vertical prisms on one face and a smooth surface on the other. The concept was simulated in LightTools 2010 Version 7.1 software by Synopsys (LightTools). See Figures 5 and 6.
With the prisms aligned, simulations confirmed the ability of the prism sheets to project the left and right eye designated pixels of interlaced images to their respective eyes.
With the prisms offset, simulations confirmed the ability to revert to 2D display mode. Sample prism sheets were then constructed and tested to verify the concept. Two types of polyethylene terephthalate (PET) prism sheets were used: one with a prism height of 1.732 mm and a base of 2 mm, and the other with a prism height of 0.1732 mm and a base of 0.2 mm. The test display's native resolution was VGA, and the resulting 3D resolution was half the native resolution. The effective viewing angle was restricted to 45 degrees, and the crosstalk at the viewing distance was 3%. A wider viewing angle would be preferable, but for a single 2D/3D viewer this suffices. While optical prism sheets were employed in this research, lenticular lens sheets would work in the same way using the same principle, and could thus produce switchable 2D/3D lenticular lens autostereoscopic displays [4,6,14].
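Two of the quantities mentioned above lend themselves to a quick check: the prism geometry and the halved 3D resolution. The sketch assumes a symmetric (isosceles) prism cross-section, and assumes that the interlaced image alternates left and right eye data column by column starting with the left eye. Neither detail is stated explicitly in the text, and the function names are illustrative.

```python
import math


def prism_base_angle_deg(height_mm, base_mm):
    """Base angle of a symmetric prism from its height and base width."""
    return math.degrees(math.atan(height_mm / (base_mm / 2.0)))


def interlace_columns(left, right):
    """Column-interlace two equally sized images given as lists of rows.

    Assumption: even columns carry left-eye data, odd columns right-eye data.
    """
    return [
        [(l_row if col % 2 == 0 else r_row)[col] for col in range(len(l_row))]
        for l_row, r_row in zip(left, right)
    ]


if __name__ == "__main__":
    for height, base in ((1.732, 2.0), (0.1732, 0.2)):
        angle = prism_base_angle_deg(height, base)
        print(f"{height} mm x {base} mm prism: base angle {angle:.2f} degrees")

    left = [["L"] * 6 for _ in range(2)]
    right = [["R"] * 6 for _ in range(2)]
    print(interlace_columns(left, right)[0])
    # A VGA panel is 640 columns wide, so each eye receives 640 // 2 columns.
    print("columns per eye on VGA:", 640 // 2)
```

Under the symmetric-prism assumption, both sheet types work out to a base angle of about 60 degrees, so the two samples differ in scale rather than in shape. The column count per eye illustrates why the 3D resolution is half the native horizontal resolution.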

Combining the concepts into a switchable 2D/3D AR/VR device
Combining the desirable characteristics of the various concepts covered in this chapter could lead to a more versatile switchable 2D/3D AR/VR device, as illustrated in Figure 7. In this configuration, the interlaced data from the organic light emitting diode (OLED) display can be split into left and right eye pixels. These pixels are then superimposed onto the real world through the translucent eyeglass lenses as desired. This has the advantage of requiring only a single small display while still providing binocular stereoscopic and autostereoscopic 3D information display, which would reduce cost. The ability to switch dynamically between 3D mode and 2D mode as needed for different applications is another advantage. The system is also a see-through display, so the user's view of the real world is not mediated. Instead, it is merged directly with synthetic data in a calibrated way, using sensors that track the user's head location and orientation, as in other mixed reality systems.

Mixed reality immersion experience discussion
There are many currently available approaches to realizing headset-type mixed reality information display, just as there are multiple approaches to realizing unbounded mixed reality information display.
Headset-type devices can be divided into subcategories described as fully immersive, optical see-through and video see-through displays, as illustrated in Figure 8.
In general, fully immersive devices tend to be used mostly for immersive virtual reality experiences. Their displays tend to be stereoscopic displays combined with sensors that track the user's head position and orientation. Optical components are used to project left and right eye pixels to their respective eye locations depending on the head position and orientation. One way to realize this projection is to directly display synchronized pairs of images with the desired image disparity on two separate near-to-eye displays, one for each eye, to create the appropriate sensation of depth. Having two displays, however, tends to increase the cost. Another way is to use optical components that effectively extract interlaced left and right pixels from a single display and project them to their respective eyes.
Figure 8. Showing some of the different types of head mounted mixed reality device systems (image credit: [15]).
In video see-through mixed reality devices, cameras first capture the real-world surroundings. Computer-generated data is then combined with or superimposed on the camera-captured world before being projected into the viewer's eyes in a calibrated way. Built-in sensors help with the tracking and the calibration. When this is done with few errors, users can experience a seemingly unmediated view of the world that just happens to be augmented with synthetic data. However, eliminating all errors and artifacts is a significant challenge.
Optical see-through devices avoid the above camera artifacts by removing the camera from the viewer's optical path to the real world. Thus, in optical see-through devices, users see the actual real world around them. Sensors in the devices then track the head location and orientation in order to overlay correctly calibrated synthetic data onto this real world. There are multiple approaches to realizing these types of mixed reality devices. The proposed configuration illustrated in Figure 7 is one such binocular autostereoscopic example.
Unbounded mixed reality systems are also a category that enables viewers to experience immersive sensations without necessarily wearing headsets or any other devices. These devices can be for a single user or multiple concurrent users. They are designed so as to provide autostereoscopic information display and sometimes interaction as well. In order to enable multiple simultaneous users to experience autostereoscopic 3D sensation some of the displays employ the concepts covered in Sections 2-2.3. However, all stereoscopic and autostereoscopic information displays make use of the concepts in the human visual system covered in Section 2 as they are the basis for human 3D and depth perception.
Some of the examples of common unbounded mixed reality displays include Walls, Caves and Domes (Figure 9).
Wall mixed reality displays can comprise multiple flat panel or curved displays that are tiled together to create an immersive experience. This immersive experience can also take the form of autostereoscopic sensations using the various multi-view autostereoscopic approaches, including the lenticular and parallax barrier systems introduced in this chapter. Another approach to achieving wall-type mixed reality displays is through projections onto the walls. These could be front projections, rear projections or both, depending on the application.
Figure 9. Showing an illustration of a curved wall mixed reality system with multiple concurrent users (image credit: [15]).
Caves mixed reality systems are in general multi-sided immersive environments that offer notably stronger sensations of immersion than one-sided standing walls mixed reality systems. This sense of immersion is sometimes enhanced with the addition of viewer surrounding autostereoscopic 3D display walls that give the viewers a greater sense of depth. Similar to walls mixed reality systems, flat panel or curved displays can be used as well as rear and front projection displays to produce the cave system.
Domes are a variation of caves mixed reality systems whereby usually the interior hemispherical domed surfaces that completely enclose a space are used as the image projection display surfaces. This configuration thereby creates a seamless 360 degree horizontal and 180 degree vertical immersive experience for the viewers. Coupling these systems with autostereoscopic 3D information display capability results in highly immersive and interactive mixed reality systems that are superior to most. The fact that these strongly immersive experiences can be enjoyed by multiple users simultaneously makes domes particularly popular in multiple industries and research fields.

Conclusion
It is not unusual to encounter mixed reality devices and systems that can provide only 2D information to viewers in attempts to induce a sense of immersion. Such systems are limited to certain types of applications, which they can perform particularly well. They perform best in scenarios where 2D information is optimal, for example in certain kinds of see-through display based labeling of real-world objects. However, such limited systems might not be ideal for unbounded device applications that require immersive autostereoscopic 3D depth. In such applications, employing the stringent autostereoscopic 3D information display concepts elaborated in this chapter would greatly enhance the immersive experience. This is because autostereoscopic displays can display 2D information just as well as 2D displays, whereas 2D-only displays cannot display autostereoscopic 3D depth information with equal facility, if at all.
Hence, this chapter introduced the basic theory of stereoscopic, autostereoscopic and switchable 2D/3D information displays. It then proposed a possible basic configuration for a more versatile switchable 2D/3D mixed reality device employing concepts similar to the lenticular prism sheets illustrated in Figure 5. Although these concepts were illustrated with a head mounted display for a mixed reality device, the 3D autostereoscopic concepts and switchable 2D/3D modes are also applicable to high-performance unbounded autostereoscopic 3D mixed reality systems.