## 1. Introduction

One of the problems in high-speed computing is the limited capability of communication links in high-performance digital electronic systems. Interconnects between VLSI circuits that are too slow and too few cause a bottleneck in the communication between processor and memory or, especially in multiprocessor systems, among the processors. Moreover, the problem is getting worse, since the increasing integration density of devices such as transistors raises the number of channels required for off-chip communication. Hence, the current situation is characterized by too few off-chip links and too slow long on-chip lines, which is described as the interconnect crisis in VLSI technology [1]. For more than ten years the use of optical interconnects has been discussed as an alternative for solving these interconnect problems [2]. Many prototypes and demonstrator systems have been built to prove the use of optics or optoelectronics for off-chip and on-chip interconnects [3].

Current VLSI technology would allow a massively parallel array processor consisting of a few hundred thousand simple processor elements (PEs) to be integrated on a chip. Unfortunately, it would be a huge problem to arrange several such PE arrays one after the other in order to realize a highly parallel superscalar and super-pipelined architecture, or to couple them efficiently to a memory chip. The reason is the insufficient number of external interconnects for moving high data volumes to and from the circuits. Optoelectronic VLSI (OE-VLSI) tries to overcome this limitation by realizing external interconnects not at the edge of a chip but with arrays of optical detectors and light emitters that send and receive data directly from the chip area. Honeywell has developed such devices with VCSEL diodes (vertical-cavity surface-emitting laser diodes) and metal-semiconductor-metal photodetectors in a research project [4].

In principle, this allows the realization of stacked 3-D chip architectures. The main problems are not the manufacturing and operation of single devices but the combination of different passive optical elements with active optoelectronic and electronic circuits in one system. This requires sophisticated mounting and alignment techniques that achieve low mechanical tolerances and handle thermal problems. At present the situation for smart detector circuits is much easier. They can be regarded as a subset of OE-VLSI circuits because they consist only of arrays of photodetectors with corresponding evaluation circuits for analogue-to-digital conversion. Optical detectors based on PN or PIN photodiodes can be monolithically integrated with digital electronics in silicon, which simplifies the design enormously compared with OE-VLSI circuits that additionally contain sender devices realized in GaAs technologies. Furthermore, smart detector circuits can be manufactured in nearly every semiconductor fab. Smart detectors or smart optical sensors have a wide application field and market potential. Therefore our approach favors a smart-pixel-like architecture combining parallel signal detection with parallel signal processing in one circuit. Each pixel has its own PE, which guarantees the fastest processing.

The strategic direction for solving various scientific problems, including the creation of artificial intelligence (AI) systems, human brain simulators, robotics systems, monitoring and control systems, decision-making systems, and systems based on artificial neural networks, is fast parallel processing of large 2-D arrays of data (up to 1024x1024 and higher) using non-conventional computational systems, corresponding matrix logics (multi-valued, signed-digit, fuzzy, continuous, neural-fuzzy and others) and the corresponding mathematical apparatus [5-11]. Many promising realizations of optical learning neural networks (NN) with two-dimensional structure [5], of recurrent optical NN [6], and of continuous logic equivalency model (CLEM) NN [7-10] require elements of matrix logic, not only two-valued, threshold and hybrid but also continuous and neural-fuzzy, together with an adequate structure of vector-matrix computational procedures built on the basic operations of these logics. Optical and optoelectronic technologies, methods and principles, as well as the corresponding element base, provide an attractive alternative for 2D data processing. They successfully solve the problems of parallelism, input-output and interconnections. Advanced non-traditional parallel computing structures and systems, including neural networks, require both parallel processing and parallel information input/output. At the same time there are many new approaches based on new logics (neural-fuzzy, multi-valued, continuous, etc.). Using standard sequential algorithms based on a few operations makes these approaches slow, and only a few of them [12] can be used for processing 2D data and perform the wide range of required arithmetic and logic operations. Generalization of scalar two-valued logic to the matrix case has led to intensive development of binary images algebra (BIA) [13] and 2D Boolean elements for optical and optoelectronic processors [12-17].

Taking into consideration the approach described above, which aims at universality, let us recall some known facts regarding the number of functions. The number of Boolean functions of *n* variables in the algebra of two-valued logic (TVL), which is also a Boolean algebra, equals 2^{2^{n}}. Functions of *n* variables in *k*-valued logic (*k*>2) are mappings A^{n} → A, where *A*={0, 1, ..., *k*-1}, and the number of such functions equals k^{k^{n}}. The number of functions of *n* arguments thus increases very rapidly with *n* [4].
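To make this growth concrete, the standard count k^{k^{n}} of *n*-argument functions in *k*-valued logic can be evaluated for small *k* and *n*. The following is only an illustrative sketch; the helper name `count_functions` is ours and does not appear in the cited works.

```python
def count_functions(k, n):
    """Number of distinct n-argument functions in k-valued logic: k ** (k ** n)."""
    return k ** (k ** n)


# Print a small table to show how quickly the count grows.
for k in (2, 3, 4):
    for n in (1, 2, 3):
        print(f"k={k}, n={n}: {count_functions(k, n)}")
```

For *k*=2 and *n*=2 the count is 16, which is the full set of two-valued functions of two variables that a universal two-input element has to cover.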

We would like to draw attention to the fact that both natural neurons and their more complex physical and mathematical models suggest discrete-analog and purely analog means of information processing with different levels of accuracy and with the possibility of changing the chosen coding system. This, in turn, requires corresponding image neuron circuit engineering with programmable logic operations and with transitions from analog to discrete processing, to storage, etc.

Thus, the search for means of constructing elements, especially universal (or at least quasi-universal or multifunctional) ones with programmable tuning, that can perform not only operations of two-valued logic but also other matrix (multi-valued, continuous, neural-fuzzy, etc.) logic operations is a highly relevant problem [15]. One promising direction of research in this area is the application of time-pulse-coded architectures (TPCA), which were considered in [18-20]. These architectures were generalized in [11], taking into account the basic possible approaches as well as system and mathematical requirements. The time-pulse representation of matrix continuous-logic variables by two-level optical signals not only increases functional possibilities (up to universality) and noise stability and relaxes the requirements on alignment and on the optical system, but also simplifies the control circuits and the circuits that tune the element to the required function or operation. It also keeps intact the whole methodological basis for constructing such universal elements, irrespective of the valuedness and type of the logic.

But there is another approach, based on the use of universal logic elements with a multiple-input multiple-output (MIMO) structure and time-pulse coding. We call such elements elements of picture type (PT). As the number of input operands and the valuedness of the logic (up to continuous) increase, the number of executable functions also increases exponentially. This property makes it possible to simplify the operation algorithms of such universal optoelectronic logic elements and hence to raise the information processing speed. The most general conceptual approaches to the construction of universal picture neural elements and their mathematical rationale were presented in [11]. But those were only system and structural solutions, which is why they require further development and refinement. Mathematical and other theoretical fundamentals of the design of matrix multi-functional logical devices (MFLD) with fast programmable tuning were considered in [19]. That work showed the expediency of unifying the functional basis, which is promising for optoelectronic parallel-pipeline systems (OEPS) with command-flow 2D-page (picture) organization [20]; the need for arrays of optical or optoelectronic triggers (memory elements) of picture type for storing information and controlling tuning operands; and promising principles for the representation and coding of multi-valued matrix data (spatial, time-pulse and spectral). Besides, the analysis of various logic algebras [11, 19, 21-24] for functional systems of switching functions, in spite of their diversity, allows us to suggest what is, in our opinion, a very useful idea, as follows.

It is possible to create more sophisticated problem-oriented processors that use specific time-pulse encoding of operands and only elements of two-valued logic, and that nevertheless realize functions of different logics, including continuous logic. Given the universality and parallel information processing of such universal elements, and the use of only two-valued logic elements to implement all other operations, this approach is very promising.

That is why the aim of this work is to present the results of the design and investigation of optoelectronic smart time-pulse-coded photocurrent reconfigurable MFLD (OPR MFLD) as basic components of 2D-array logic devices for advanced neural networks and optical computers.

## 2. Design and simulation of two variants of the OPR MFLD base cell

### 2.1. Picture continuous logic elements (PCLE)

Figure 1 shows the structural diagram of a picture neural element (PNE) for the computation of all basic matrix-continuous-logic (MCL) operations in the matrix quasi-Boolean algebra **C=((A,B), ∧, ∨, ¯)** [11], in which, for any set of MCL arguments, the matrix continuous logic function (MCLF) **F** takes the value of a subregion of one of the arguments or of its complement. The PEs of matrix two-valued logic (MTVL), performing MTVL operations over matrix temporal functions O^{i}_{t}(t) (in fact, two-valued 2D operands), realize an MCLF over continuous logic variables (CLV) O^{i}_{t}. The time-pulse coding of a grayscale picture is shown in Figure 1. As seen in Figure 2, at each point of the picture output of the PNE, an MCL operation can be performed over the continuous logic variables (CLV) O^{1}_{ijT}, ..., O^{n=2}_{ijT}, represented by the durations t^{1}_{ij}, ..., t^{n}_{ij} of time-pulse signals. During each interval T one of the CL operations is performed, among them min(a,b), max(a,b) and mod(a-b).

Thus, it becomes obvious that the time-pulse-coded realization of a PNE of matrix continuous logic (MCL) with programmable tuning requires a UPE of TVL, or a picture MFLD, by means of which continuous-logic operations over time-pulse signals can be realized. In Figure 1 the picture logic function is selected by electrical tuning signals, and all array cells realize the same function at the same time. For many applications it is expedient to choose the logic function at each point of the matrix processor, so it is desirable to supply control and tuning in the form of optical matrix operands as well. This substantially expands the functionality of such processors and of the MFLD on which they are based.
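The relation between two-valued operations on time-pulse signals and continuous-logic operations on their durations can be illustrated with a short sketch. It assumes that both pulses start at the beginning of the period and that one time step corresponds to 1 ns; the function names `encode`, `decode` and `pointwise` are hypothetical and serve only this illustration.

```python
T = 100  # period in time steps (assumption: 1 step = 1 ns)


def encode(duration, period=T):
    """Two-level signal: 1 during the first `duration` steps, 0 afterwards."""
    return [1 if t < duration else 0 for t in range(period)]


def decode(signal):
    """Recover the continuous value as the total time the signal is high."""
    return sum(signal)


def pointwise(op, x, y):
    """Apply a two-valued operation to two signals step by step."""
    return [op(p, q) for p, q in zip(x, y)]


a, b = encode(50), encode(80)

t_and = decode(pointwise(lambda p, q: p & q, a, b))       # min(a, b)
t_or = decode(pointwise(lambda p, q: p | q, a, b))        # max(a, b)
t_xor = decode(pointwise(lambda p, q: p ^ q, a, b))       # |a - b|
t_nxor = decode(pointwise(lambda p, q: 1 - (p ^ q), a, b))  # T - |a - b|

print(t_and, t_or, t_xor, t_nxor)  # 50 80 30 70
```

With pulse durations of 50 and 80 steps in a period of 100 steps, AND yields the minimum, OR the maximum, XOR the modulus of the difference, and its inversion the equivalence, so only two-valued elements are needed to obtain continuous-logic results.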

In [25], MFLD of two-valued logic (TVL) based on current mirrors, photodiodes and LEDs, together with their driver circuits, are described and simulated. They are relatively complex, as they contain four current mirrors (CM), four XOR circuits, four AND elements and one OR logic element. In the same work, different optoelectronic circuits based on 2-4 CM and one photodiode were proposed that realize the Boolean operations AND, NOT, OR, NOR, etc. with potential and current outputs. They are based on threshold elements, comparators of currents (photocurrents) on current mirrors, and circuits of limited subtraction (CLS). Such base elements were also used for the realization of other elements of continuous logic, including the operations of equivalence (nonequivalence), etc. [21, 26, 27]. We therefore develop this approach further and use it for the design of the OPR MFLD.

### 2.2. Designing of the base cell for the first version of OPR MFLD-1

The functional circuit of the OPR MFLD-1 (the first version) is shown in Figure 3, and the circuit diagram of the OPR MFLD-1 on 1.5 µm CMOS transistors is shown in Figure 4. It contains 4 optical inputs (the apertures of the photodiodes PD) and four cells (PD-CM)_{1} ÷ (PD-CM)_{4}, which act as threshold elements (with threshold -i_{0}) and realize the operation of limited subtraction (LS).
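As a minimal sketch of how limited subtraction can act as a threshold element, the snippet below assumes the usual continuous-logic definition (the difference a - b when a > b, otherwise 0) and unit-valued currents. The threshold value `I0 = 2` and the variable names are assumptions made for illustration and are not taken from the circuit.

```python
def limited_subtraction(a, b):
    """Limited (bounded) subtraction: a - b when a > b, otherwise 0."""
    return max(a - b, 0)


I0 = 2  # assumed threshold current, in units of one input photocurrent

# Sum three unit currents (inputs a, b and tuning signal y) and subtract I0:
# only the case in which all three currents are present leaves a residue.
results = {}
for a in (0, 1):
    for b in (0, 1):
        for y in (0, 1):
            results[(a, b, y)] = limited_subtraction(a + b + y, I0)

print(results[(1, 1, 1)], results[(1, 1, 0)])  # 1 0
```

Under these assumptions a single cell passes current only when both operand signals and the tuning signal coincide, which is how one minterm can be selected.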

The base cell of the first version, OPR MFLD-1, has several sub-variants, which correspond to different ways of forming the thresholds.

Figures 5a and 5b show the construction (a matrix fragment, one OPR MFLD-1), the scheme of the base nodes and the simplest optical imaging system for the connections. The scheme contains 4 photodiodes and 5+8+5=18 transistors (not counting the driver transistors) and is simple enough. By changing optical (or electrical) signals of tuning vector y1÷y4 at input 4 photodiodes signals from light emitter diodes LED and

Signals from the first input A and from the second input B (a variant of output II), together with the tuning vector y1÷y4, are transformed into a total photocurrent. Base elements of limited subtraction (LS) based on (PD-CM)_{i} separate out the corresponding logic minterms by subtraction of threshold currents

### 2.3. Simulation of the base cell for the first version of OPR MFLD-1

Results of modeling the proposed OPR MFLD-1 with the OrCAD 16.3 package are shown in Figure 6 for different tuning signals. In the same figure, the second, third and fourth diagrams show, respectively, the input and output pulse currents of nodes (PD-CM)_{2} ÷ (PD-CM)_{4}. The duration of current pulse I35 at input A (see the third diagram in Figure 6a) equals 2 µs. The duration of current pulse I39 at input B (see the fourth diagram in Figure 6a) equals 7 µs. The output pulse current ID(Q44) of the circuit is shown in the bottom diagram in Figure 6a, and its duration equals 8 µs (8=10-2). This confirms the correct operation of the circuit.

The diagrams in Figures 6b, 6c and 6d, similar to Figure 6a, show the corresponding input and output currents of the circuit. They differ in the operating modes, the input pulse durations and the presence of additional power consumption graphs. Figure 7a shows the dependence of the power consumption of OPR MFLD-1 on … 10^{8}-10^{9} CL-logic operations/sec.

We tested the circuit experimentally for all the functions it can implement. The experiments confirm the implementation of all theoretically possible functions over a wide range of voltages, currents and processing periods. Given the size limitations of the article, we do not present all results and charts here.

If cells of the MFLD-1 are integrated into a 32x32 array, the productivity can reach 10^{12} CL-logic operations/sec. A modified variant of OPR MFLD-1, in which the signals y1÷y4 are produced by programmable current generators, is also proposed. Besides, if the whole array of MFLD-1 cells realizes the same function, it is possible to select the signals at the corresponding nodes (PD-CM)_{i}. This simplifies the optical system, because signals then have to be delivered to the OPR MFLD-1 chip from only two optical apertures instead of three.

### 2.4. Modeling of array of the OPR MFLD-1 with MathCAD

Modeling results of the OPR MFLD-1 with MathCAD, which confirm the normal functioning of OPR MFLD-1 for all 16 possible functions of binary logic and the corresponding functions of continuous logic, are shown in Figures 8-11. The two input 2D operands XA and XB (Figure 8), with dimensions of 32x32 pixels, are transformed into XAR and XBR by expanding each pixel to 2x2 pixels. Matrices XAR and XBR have dimensions of 64x64 pixels.

Four matrices M1÷M4 are formed with the formulas shown in Figure 9. These matrices are used to select one subpixel out of the four pixels of XAR and XBR. Matrices AXR and BXR are formed from XAR and XBR by the elementwise non-equivalence (⊕) operation with matrices MA and MB. The tuning 2D operand OP is formed from matrices M1÷M4 and scalar tuning signals oy1÷oy4 or signals y1÷y4.

Matrix SAB is formed as the sum of AXR, BXR and OP. Threshold processing of the elements of SAB with the threshold value tr = 3 gives matrix QSAB. The four subpixels of each pixel are then united into one pixel, which gives the output matrix UQSAB with dimensions of 32x32. A final threshold processing with t_{0} = 1 gives the output matrix ESAB.
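The processing chain described above (pixel expansion, masking, summation with the tuning operand, thresholding, merging of subpixels and final thresholding) can be sketched per pixel as follows. This is a hedged reconstruction, not the MathCAD model itself: the mask patterns `MA` and `MB`, the use of "greater than or equal" in both thresholds, the merging of subpixels by summation, and the function names are all our assumptions.

```python
# Assumed XOR masks for the four subpixels of one pixel. With these masks,
# subpixel k responds to the input pair (a, b) = (0,0), (0,1), (1,0), (1,1).
MA = (1, 1, 0, 0)
MB = (1, 0, 1, 0)


def mfld1_pixel(a, b, y, tr=3, t0=1):
    """Evaluate one pixel; a, b are 0/1 inputs, y is the 4-element tuning vector."""
    fired = 0
    for k in range(4):
        sab = (a ^ MA[k]) + (b ^ MB[k]) + y[k]  # SAB = AXR + BXR + OP
        fired += 1 if sab >= tr else 0           # QSAB: threshold tr
    return 1 if fired >= t0 else 0               # merge subpixels, threshold t0


def mfld1_array(XA, XB, y):
    """Apply the same tuning vector y to every pixel of two binary images."""
    return [[mfld1_pixel(a, b, y) for a, b in zip(ra, rb)]
            for ra, rb in zip(XA, XB)]


def truth_table(y):
    """Outputs for (a, b) = (0,0), (0,1), (1,0), (1,1)."""
    return tuple(mfld1_pixel(a, b, y) for a in (0, 1) for b in (0, 1))


XA = [[0, 0], [1, 1]]
XB = [[0, 1], [0, 1]]
print(mfld1_array(XA, XB, (0, 1, 1, 0)))  # XOR: [[0, 1], [1, 0]]
print(truth_table((0, 0, 0, 1)))          # AND: (0, 0, 0, 1)
```

In this sketch the tuning vector is simply the truth table of the selected function, so the 16 possible tuning vectors produce the 16 two-valued functions of two variables.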

For a more detailed view, fragments AP, BP, OPP, OSP, OQP and QSP, with dimensions of 2x2 subpixels or 4x4 pixels, taken from matrices AXR, BXR, OP, SAB, UQSAB and ESAB, are shown in Figure 10. The fragments are shown as matrices and as images. For convenient display of the images in MathCAD, the matrices are multiplied by 80. The output of the equivalence operation is QSP with dimensions of 2x2, but for correct operation of the OPR MFLD, matrices QSAP and QABP with dimensions of 4x4 are used.

Examples of the realization of other functions with the OPR MFLD-1 are shown as image fragments in Figure 11.

### 2.5. Investigation of the base cell for the second version of OPR MFLD-2

#### 2.5.1. Simulation of OPR MFLD-2 with OrCAD 16.3

The second circuit variant is shown in Figure 12. It differs from the first variant in that the optical signals of the two picture operands at each (i,j)-th base cell are fed to one photodetector. At one of the picture inputs, an appropriate shadow mask attenuates the signals of that operand by a factor of 2. Therefore, the first unit of the circuit consists of current comparators, which convert the output voltages into a digital form that uniquely corresponds to the input combination.

Using current-to-voltage conversion nodes and the control signals Y0-Y3, the resulting signal is formed at the output node as a current that corresponds to the selected logic function. The vector of signals Y0-Y3 has 16 possible combinations. Selecting one of them implements any of the 16 possible binary operations of two-valued logic. If the input signals are continuous and given in time-pulse-coded form, selecting a two-valued operation such as AND implements the operation MIN over the time-pulse-encoded signals. In the first model experiments, two current sources were used in place of the input photosensor to set the durations of the input time-pulse signals (TPS). Likewise, current sources Y0÷Y3 were used instead of photodetectors to select the function. The reference currents are shown as current sources for simplicity. The current sources can be implemented on the same transistors or can be supplied as optical signals with fixed intensity. To form the amplified output current required for light emitters or for a driver circuit, a current multiplier based on a current mirror can be used.
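The principle of this variant (one weighted sum of the two operands, three comparators, and state decoding gated by the tuning vector) can be sketched behaviorally. The sketch works with normalized currents and is not a circuit model: the comparator levels, the assignment of Y0..Y3 to input states, the 1 ns time step and all function names are assumptions made for illustration, and the ordering of the vector components in the actual circuit may differ.

```python
def decode_state(a, b, i0=1.0):
    """One-hot input state from the weighted sum a*i0 + b*i0/2 (levels 0, 0.5, 1, 1.5)."""
    s = a * i0 + b * i0 / 2
    t1, t2, t3 = s > 0.25 * i0, s > 0.75 * i0, s > 1.25 * i0  # three comparators
    # Assumed order of states: (a,b) = (0,0), (0,1), (1,0), (1,1)
    return (not t1, t1 and not t2, t2 and not t3, t3)


def mfld2(a, b, Y):
    """Output is 1 when the tuning component of the active input state is 1."""
    return sum(int(y) * int(s) for y, s in zip(Y, decode_state(a, b)))


def pulse(duration, period=100):
    """Two-level pulse starting at t = 0 (assumption: 1 step = 1 ns)."""
    return [1 if t < duration else 0 for t in range(period)]


def output_duration(dur_a, dur_b, Y, period=100):
    """Total time the output is high for two time-pulse-coded inputs."""
    a, b = pulse(dur_a, period), pulse(dur_b, period)
    return sum(mfld2(x, y, Y) for x, y in zip(a, b))


print(output_duration(50, 80, (0, 1, 1, 0)))  # XOR -> 30
print(output_duration(50, 80, (1, 0, 0, 1)))  # equivalence -> 70
print(output_duration(50, 80, (0, 0, 0, 1)))  # AND in the assumed ordering -> 50 (MIN)
```

Because exactly one input state is active at any moment, the output is the tuning component of that state, so a single four-component vector selects any of the 16 functions.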

The simulation results for this scheme on 65 nm CMOS transistors with OrCAD 16.3 PSpice, at different voltages and power levels of the input signals, are shown in Figures 13-20.

Experiments have shown that the power consumption of a cell does not exceed 200÷300 µW, delay times and pulse fronts are shorter than 1 ns, and the basic cell is realized on 44 (or 36) transistors and 11 current sources on 11÷15 transistors. The duration of the time-pulse-coded signal lies within the processing cycle, and the pulse period is 100 ns. This shows that it is possible to increase the frame processing rate to 10 MHz, but at the expense of accuracy and of the complexity of matching photodetectors with current mirrors.

Simulation results with OrCAD 16.3 for the same basic cell circuit of the OPR MFLD-2, in the mode implementing the CL nonequivalence (TVL XOR), are shown in Figure 13. The simulation conditions are Id = 5 µA, 3 V supply voltage, and signal durations t^{a}_{pulse} = 50 ns, t^{b}_{pulse} = 80 ns. The first diagram from the top shows the output current signal; the second shows the two input signals and their weighted sum; the third, fourth and fifth show the currents at the outputs of the threshold units (green solid) and their complements (blue dashed). The vector of tuning signals is **Y** = {Y0, Y1, Y2, Y3} = {0, 1, 1, 0}, and the current level is 5 µA. The correct signal, with a duration of ≈30 ns, is formed at the output. Changing the vector to {0, 1, 0, 0} yields the output function I22 * NI23 (where NI23 is the complement of signal I23), as shown in Figure 14. To verify that the function is implemented correctly, we exchanged the durations of the signals, so that t_{pulse}(I22) = 80 ns and t_{pulse}(I23) = 50 ns. The results showed a signal at the output with a duration of ≈30 ns.

If the vector is changed to {0, 0, 1, 0}, the signal at the output differs only in short false pulses. Exchanging the durations of the input signals with the same vector provides the desired signal at the output (see Figure 15). This confirms the correct operation of the scheme.

In Figure 16 (left) the implementation of the CL equivalence (based on the TVL NXOR) is shown. The output signal (the first graph from the top) has a total duration of 70 ns. The operation NOR TVL and on its basis the operation … **Y** = {Y0, Y1, Y2, Y3} = {0, 0, 0, 1}, and the right - signals: output, input and intermediate. As can be seen from the simulations, the device successfully implements the desired function when the supply voltage changes from 1.5 V to 3.3 V, and in accordance with the results: power consumption

The circuit diagram of the OPR MFLD-2 with photodiodes (Figure 18) is used for simulations with OrCAD 16.3 PSpice. The model of the photodiode is the same as in Figure 4. The simulation results are shown in Figure 19. Four periods are displayed; in each, a different tuning vector is applied and a different function is performed: the first period - vector {1,0,0,1} (equivalence), the second period - vector {1,0,1,0} (inversion of the first variable), the third period - vector {0,1,1,0} (non-equivalence), the fourth period - vector {0,1,0,0} (AND

The signals of these vectors are displayed as yellow lines in the lower four graphs. The blue lines show the output currents generated by the configuration signals and the corresponding nodes. The sum of the output currents of these nodes is the output signal. It is shown as a green line in the second graph from the top, where the blue line shows the input photocurrent from the two arguments. The top graph shows the power consumption of the base cell. The main problem in these cells is a significant deterioration of the fronts (an increase of up to 200 ns). Moreover, neither a change of the operating voltage from 3 V to 5 V, nor a change of the photocurrent amplitude (in the experiments Io = 5 µA, 10 µA and 15 µA; at 20 µA the circuit did not work), nor different levels of the reference current generators significantly affects the duration of the fronts. It is therefore necessary to look for other circuit solutions, for example cascode current mirrors or more complex but faster current or voltage comparators. However, these significantly increase the hardware cost of a basic cell and do not allow a high level of integration on a chip. So here we show the circuit with the processing period extended up to 10 µs, which provides the required characteristics at Io = 5 µA. Power consumption does not exceed 300÷350 µW at a supply voltage of 2.4 V and 3.0 V on the photodiodes. Results of the experiments are shown in Figure 20. Dynamic reconfiguration of the optical signals (vector Y) provides the desired function of the basic cell, and the duration of the reconfiguration process is equal to the period T = 10÷100 µs. In addition, if other technologies are used, the vector set can be supplied as electrical signals.

#### 2.5.2. Simulation of the OPR MFLD-2 with MathCAD

Simulation results of the proposed OPR MFLD-2 with MathCAD and its use for image processing and fuzzy logic operations are shown in Figures 21-24.

Formulas for the simulation with MathCAD are shown in Figure 21. First, the two input 2D operands **A1** and **B1** and their weighted sum **SIAB** are formed. The coefficient and threshold are t_{0} = 10 because the current in the OPR MFLD-2 circuit is 10 µA. Matrices **AN1** and **BN1** are the contrast-complementary images. After threshold processing by current comparators, the direct matrices **T1SIAB**, **T2SIAB**, **T3SIAB** and the matrices **TN1SIAB**, **TN2SIAB**, **TN3SIAB** of complementary images are formed. Four picture tuning operands **NY0÷NY3** are formed from the tuning vector signals ny0÷ny3. Four logical terms **SY0÷SY3** are formed using simultaneous threshold and state-decoding operations. The sum of those terms is the output matrix function **NF**. All operands have dimensions of 64x64 elements. Images of all the above-mentioned matrices and of some output functions are shown in Figure 22.

Simulation results for the implementation of different functions (AND, EQ, NEQ, OR) in four different sub-regions are shown in Figure 23. **XD** and **YD** are the input matrices. The tuning matrices **VY0÷VY3** have different values in the sub-regions. The output matrix **VF** is the concatenation of the sub-region functions.

Let us demonstrate the possibilities for image processing with such devices. An example of contour extraction (**NF**), using the input image **A1** as the first operand and its shifted copy **AES1** as the second operand, is shown in Figure 24. In Figure 24, **NY0, NY1, NY2, NY3** are the tuning matrices for that operation and **NF** is the output image.
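A minimal sketch of such processing with per-pixel tuning matrices is given below. It assumes that the contour is obtained with the non-equivalence function applied to the image and a copy shifted by one pixel to the right; the actual function and shift used for Figure 24 are not stated in the text, and the small test image and helper names (`shift_right`, `constant`, `apply_tuned`) are hypothetical.

```python
def shift_right(img):
    """Copy of img shifted one pixel to the right, zero-filled on the left."""
    return [[0] + row[:-1] for row in img]


def constant(value, like):
    """Matrix with the shape of `like`, filled with `value`."""
    return [[value for _ in row] for row in like]


def apply_tuned(A, B, NY):
    """NY = (NY0, NY1, NY2, NY3): per-pixel tuning matrices.

    NYk gives the output for the input pair (a, b) with k = 2*a + b
    (assumed order), so each pixel may realize its own function.
    """
    return [[NY[2 * a + b][i][j] for j, (a, b) in enumerate(zip(ra, rb))]
            for i, (ra, rb) in enumerate(zip(A, B))]


A1 = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
AES1 = shift_right(A1)

# Non-equivalence in every pixel: outputs 0, 1, 1, 0 for the four input pairs.
NY = (constant(0, A1), constant(1, A1), constant(1, A1), constant(0, A1))
NF = apply_tuned(A1, AES1, NY)

for row in NF:
    print(row)
```

With a horizontal shift only the vertical edges of the rectangle are marked. Filling the tuning matrices with different values in different sub-regions gives region-dependent functions in the same way.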

## 3. Conclusions

We have developed two versions of the OPR MFLD, which realize universal binary logic on optical signals. They have a subpixel configuration of 2x2 elements, consist of a small number of photodiodes (4) and transistors (18), have low power consumption (<1-5 mW) and high productivity, and realize the basic set of operations of continuous logic with time-pulse representation of the processed signals. Such cells were modeled with OrCAD. It is confirmed that the whole set of possible functions can be realized with such MFLD by simple photo tuning. When integrated into an array of 32x32, such cells for the OPR MFLD allow a productivity of 10^{12} CL-logic operations/sec to be reached.