Summary of all detectors described in this chapter.

## Abstract

Multiple-input multiple-output (MIMO) systems entered most major standards in the past decades, including IEEE 802.11n (Wi-Fi) and long-term evolution (LTE). Moreover, MIMO techniques will be used for 5G by increasing the number of antennas at the base station end. MIMO systems enable spatial multiplexing, which has the potential of increasing the capacity of the communication channel linearly with the minimum of the number of antennas installed at both sides without sacrificing any additional bandwidth or power. To handle the space-division multiplexing (SDM), receivers have to implement new algorithms to exploit the spatial information in order to distinguish the transmitted data streams. This chapter provides an overview of the most well-known and promising MIMO detectors, as well as some unusual-yet-interesting ones. We focus on the description of the different paradigms to highlight the different approaches that have been studied. For each paradigm, we describe the mathematical framework and give the underlying philosophy. When hardware implementations are available in the literature, we provide the results reported and give the according references.

### Keywords

- MIMO systems
- MIMO detectors
- space-division multiplexing
- SDM-MIMO
- linear detection
- interference cancelation
- tree-search

## 1. Introduction

In multiple-input multiple-output (MIMO) communication systems, both the transmitters and receivers are equipped with several antennas which will help in achieving high gains in spectral, power, and energy efficiency compared to conventional single-input single-output (SISO) systems where both the transmitters and receivers have only one antenna each. As a matter of fact, the MIMO systems have the ability to turn multipath propagation and multipath delay spread into a benefit for the receiver. The key advantage of MIMO systems is the many orders of magnitude of the signal-to-noise ratio (SNR) at no extra bandwidth. However, a non-negligible software and hardware processing complexity is added at both sides (transmitter and receiver). Present wireless communication standards including Wi-Fi standards like IEEE 802.11n/ac, long-term evolution (LTE), and WiMAX are considering MIMO technology as a key element. Moreover, in the next generation of wireless technology systems (i.e., 5G), massive MIMO is emerging as a new research field in which base stations are equipped with 100 or more antennas. At the receiver side, designing reliable and energy-efficient MIMO detectors is a very challenging task, because of the complexity of the implementation of the signal detection due to the interfering sub-streams. The signal detection problem refers to finding the most probable transmitted symbols based on the perfect channel state information (CSI) available at the receiver and the received signal.

The hardware implementation of massive MIMO detector is of particular interest to deal with 5G wireless technology. Optimal massive detectors such as the maximum likelihood detector (MLD) or the sphere decoding (SD) are considered infeasible given their high computational complexity. Hence, low computational complexity algorithm achieving near-optimal performance is required; many existing detection algorithms like zero forcing (ZF), minimum mean-square error (MMSE), and successive interference cancelation (SIC) are used to deal with massive MIMO detection. In [1, 2], the authors presented surveys on various MIMO and massive MIMO detection techniques from algorithmic viewpoints. Although many classical massive MIMO detectors have been proposed in the literature, herein, new recent algorithms based on the application of machine learning, geometrical techniques, and bioinspired methods are presented and discussed.

In this chapter, we propose an overview of the SDM detection algorithms. We specifically stress out the different paradigms that are used to solve the detection problem and compare all of them. Thus, we describe the most well-known and promising MIMO detectors, as well as some unusual-yet-interesting ones. Section 2 presents the framework and the assumptions that are used in the remainder section. Section 3 introduces the maximum likelihood (ML) optimal detector, and then Section 4 describes the linear ones. Section 5 details algorithms based on the interference cancelation, and Section 6 discusses the one based on tree-search. Finally, Section 7 highlights unusual-yet-interesting detectors before Section 8 concludes the chapter. Figure 1 provides an overview of all the detectors described in this chapter as a tree mind map.

## 2. Introduction to MIMO detection algorithms

In the SDM framework, data streams are transmitted at the same time and at the same frequency, and the receiver relies on spatial consideration to distinguish the streams. Herein, we assume that the MIMO transmitter does not use any spatial coding and that all data streams are independent. To give the reader a unified mathematical description through this chapter, we adopt the following notation: scalars, vectors, and matrices are denoted by lower-case, bold-face lower-case, and bold-face higher-case letters, respectively. We call *i*th coefficient of the vector *i*th row and *j*th column in the

In the linear input–output MIMO model where data are transmitted as the symbols of a constellation

and the MIMO detection problem then refers to the combinatorial optimization problem:

Assuming a circularly symmetric Gaussian noise, solving Eq. (2) is equivalent to searching the most probable emitted symbol vector based on the signal on each receive antennas and the channel state. Even if

Let us start by outlining the traditional assumptions that we will use in the present chapter. Although many constellation types could be used in MIMO systems, we limit the discussion to the square quadrature amplitude modulations (QAMs) that are most commonly investigated. Besides, the channel is considered memoryless, linear, and flat and with a block fading^{1}. In this chapter, we assume that channel state information (CSI) is correctly estimated at the receiver side but not at the transmitter side. The impact of imperfect CSI at the receiver on the performance of detection algorithms is not addressed in the present chapter. Some known training symbols are sent from the transmitter, based on which the receiver estimates the channel before proceeding to the detection of the transmitted data symbols.

The channel matrix is modeled as a complex matrix *j*th transmit antenna to the *i*th receive antenna. Many channel models can fit in this framework, and we stick to the most popular one: the uncorrelated Rayleigh fading channel [3, 4]. The uncorrelated channel model provides a good approximation of propagating environments with rich scattering where the signals between the transmitter and the receiver experience many different paths and no strong line of sight between the transmitter and the receiver. This situation occurs, for instance, in urban and indoor conditions. In these conditions, each receiver antenna receives a sum of a large number of signal paths, and the channel transfer functions can be modeled as the realization of a circularly symmetric normal distribution.

## 3. Maximum likelihood detector

Obtaining the optimal result requires, in the most straightforward approach, the use of the ML detector that solves Eq. (2) using an exhaustive search. Even if this method gives the best result since all

From a computational theory perspective, the detection problem is an instance of the closest lattice-point search (CLPS) problem with a specified lattice [5]. It has been proved that regardless of the preprocessing on the lattice (i.e., the channel matrix), the problem is always NP-hard [5]. The NP-hardness implies that it is not possible, at the moment, to find any detector that is sure to have both an optimal performance and a polynomial complexity^{2}. For that reason, all the following detectors have suboptimal performance (which can be very close to optimal) or a non-polynomial worst-case complexity (which can be polynomial is the average case).

## 4. Linear detectors

### 4.1 Zero forcing (ZF) detector

Linear detectors are the most simple algorithms to solve the detection problem. The most basic one is the ZF algorithm that follows a two-step process. First, the ZF detector solves Eq. (2) transforming the constraint from

with

highlighting that all the interference are canceled. The previous equation is also the proof that the ZF detector is the optimal linear one regarding the signal-to-interference ratio (SIR) criteria. Indeed, one can see that the vector

### 4.2 Minimum mean-square error (MMSE) detector

By only focusing on the interference, the ZF detector performance suffers from not taking the noise into account. Indeed, if the noise level is known to the receiver, a Bayesian estimator including this information can provide a better detection. A linear Bayesian estimator minimizing the mean-square error can be derived using the orthogonality principle [6] leading to

with

## 5. Interference cancellation detectors

To improve further the performance, it is necessary to drop the linear detector approach and look for more elaborate decoding algorithms. Historically, the first nonlinear detector type is still based on the principle of canceling signal interference. This concept leads to two approaches: an iterative one named successive interference cancelation (SIC) and a simultaneous version named parallel interference cancelation (PIC).

### 5.1 Successive interference cancellation (SIC) detector

The SIC detectors opt for a two-step iterative process: first, a decision is taken on the first position

Even if the performance is better than with the linear detectors, the SIC process is very prone to error, given that the assumption at an iteration has an impact on all the following ones. For this reason, the simple SIC detector has quickly been replaced by a variant seeking for an optimal iteration order [7]. This variant called ordered successive interference cancelation (OSIC) aims to make the first assumption on the position that leads to the better SNR or SINR.

To select the best symbol to detect at each iteration, the OSIC detector computes the post-SNR or post-SINR for each symbol, assuming that the *k*th element is canceled using the detection matrix *k*th post-SNR is computed as in [7, 8].

with *k*th row of *k*^{th} column of *k*th data stream that can be computed, assuming that each symbol is equiprobable, as

with

When using the SINR criterion, the post-SINR expression becomes slightly more complex as the post-processed power of each other channel appears in the expression

For clarity sake, Figure 2 sums up the OSIC detection algorithm introducing a process

### 5.2 Parallel interference cancellation (PIC) detector

The main drawback of the OSIC algorithm is that the number of iterations grows linearly with the number of antennas. The number of stage becomes an issue for large MIMO system since each stage adds a reception delay. For that reason, a detector capable of canceling the interference for all antennas at once was developed.

The first application of such an algorithm to SDM systems dates from the early 2000s, and it is based on a few basic steps summed up in Figure 3 as published in [9]. The main point of the PIC algorithm is to start by using a simple detector with poor performance, most of the time a linear one, and cancel the interference on all antennas at once based on the assumption. If better performance is required, it is possible to iterate the last three instructions as many times as needed by using the new detected symbol as the new assumption.

Simultaneously, the iterative reception techniques developed for turbo codes and single-input single-output channels are adapted to MIMO systems. The goal is to receive a coded message by alternating between soft-input soft-output detector and decoder. Each algorithm uses a priori information from the other to improve its performance [10]. This method leads to one of the current most accomplished version of the PIC family: a soft-input soft-output detector to be used in iterative decoding with any message coding [11].

This version adds several improvements to the basic algorithm described in Figure 3. First, it uses the soft symbols from [12] that are defined as the expected value of the symbols knowing the a priori. The reliability of a soft symbol is computed as its variance. Then, the parallel cancelation (see the third instruction in Figure 3) is performed using the soft symbols in place of the rough estimation. Finally, a last MMSE filtering is performed before the computation of the log-likelihood ratios (LLRs). Further reductions in complexity are also used, such as the max-log approximation [10, 13, 14] or the channel Gram matrix [15]. Thanks to all of these improvements, an application-specific integrated circuit (ASIC) is reported to achieve a throughput greater than 750 Mb/s with good BER performance [11].

### 5.3 Selecting between SIC and PIC detectors

The key idea to select between SIC and PIC detectors is to compare the relative quality of data streams. As stated earlier, SIC algorithms guess the best data stream and then process the other one based on this assumption. This process makes the SIC algorithms very prone to error propagation. Indeed, if an assumption is wrong, the error has consequences on all the data streams to be detected. Hence, SIC detectors should be used when there is a net ranking in the quality of each data stream. A basic scenario for this would be a MIMO system receiving data from several users with different channel qualities.

On the contrary, PIC detectors process all the data streams at once so that they are more resilient to interstream error propagation. However, the parallel computations assume that every data stream is as reliable as the other. Due to this assumption, a poor-quality data stream propagates its error to the whole system. For that reason, PIC detectors are well suited when all data streams have the same quality level.

## 6. Tree-search-based detectors

Tree-search-based detectors are the current most investigated algorithms. They use a different framework than the linear and the interference cancelation detector. As the name suggests, the tree-search detector interprets the detection problem from Eq. (2) as the search for the best path in a tree. Tree-search-based detectors can either be optimal with a non-polynomial yet small complexity or quasi-optimal yet not optimal with a polynomial convexity.

Figure 4 gives an example of the tree interpretation for a constellation with four symbols and two data streams. In this configuration, solving the detection problem is equivalent to find two symbols in the set

In this paradigm, the exhaustive search detector described in Section 3 computes the objective function for each leaf node and then selects the best path. Tree-search-based detectors search for the best leaf without trying every path. This leads to three enumeration paradigms: depth-first, breadth-first, and best-first. These paradigms will be detailed after the description of the preprocessing used by all the variants.

### 6.1 Preprocessing using QR decomposition

All tree-search-based detectors use the same preprocessing. Let be

as unitary matrices act as isometries. Thus, by exploiting the property of unitary matrices

with

The point of this QR preprocessing is that the triangularity of *k*th coefficient of the product

Moreover, the triangularity of

that leads to

With this expression, it is clear that we can compute partial estimations of the objective function and *j*th position as soon as a hypothesis is made on

### 6.2 Depth-first tree-search detection: sphere decoding

The depth-first paradigm is the oldest one, and it is commonly known in the communication field as the sphere decoding (SD). SD is the transposition of the mathematical Fincke-Pohst algorithm in the telecommunication field [16]. The basic principle of this algorithm is to define an upper-bound for the objective function named the radius

This inequality highlights that constraining the objective function may be interpreted as looking for solution no so far from

In the remainder of this section, we assume that

The SD is referred to as a depth-first detector as starting from the root node, it goes as depth as possible until it reaches a leaf or violates Eq. (14). If a leaf is reached, it is compared to the best leaf so far and saved if it is the new best leaf. If Eq. (14) is violated, the SD algorithm backtracks and explores a new path. When all paths are either completed or pruned, the result is the best leaf reached.

The Schnorr-Euchner (SE) enumeration is another depth-first enumeration known to perform better by using a lattice reduction method [17, 18]. The basic idea is to explore the node’s children by the increasing order of their PED. This is particularly useful when using the radius reduction technique that sets an infinite

The SD algorithm and its SE version are optimal as they ensure to find the exact solution of the detection problem. Indeed, the best leaf is obviously the best leaf among all the completed paths, and the pruned paths cannot lead to a better point due to their already worst PED. The NP-hardness argument detailed in Section 3 implies that SD has a non-polynomial worst-case complexity. Moreover, SD expected complexity is also non-polynomial even if the exponential growth is slow enough to compete with polynomial detectors under certain circumstances [19].

A very efficient soft-input soft-output depth-first algorithm is the single tree-search sphere decoding (STS-SD) [20]. To produce its soft-output, it uses the max-log approximation [10, 13, 14] and makes some changes on the pruning criterion. The max-log approximation avoids the computation of the exact LLRs by claiming that

where *i*th bit set to

One of the most advanced ASIC-implemented depth-first reported so far uses a two-dimensional Schnorr-Euchner enumeration. This implementation reaches a throughput higher than 600 Mb/s for the soft-output version and exceeds 1.2 Gb/s for the hard-output one while keeping an excellent energy efficiency [21]. The high throughput is achieved by using several SD cores in parallel to decode several vectors simultaneously.

### 6.3 Breadth-first tree-search detectors: K-best and M-algorithm

Breadth-first detectors drop out of optimality for better implementability. Indeed, they address the two main issues of the depth-first paradigm: the unpredictable complexity that depends on the SNR through

The restricted number of surviving paths induces that the right one can be pruned early if its PED has grown too quickly in the early levels. This is the reason for the optimality loss. Then, some detectors implement a post-detection SNR criterion to reorder the tree levels such that the most certain one is at the top. Thus, the right path is less likely to be pruned by mistake in the early stages. Thanks to this reordering, the K-best algorithm performs very well yet, not optimally in a mathematical sense.

From a hardware implementation perspective, breadth-first tree-search detectors are very efficient. They do not require any backtrack so that an expanded node can be safely deleted as it will never be revisited. Moreover, the number of visited nodes per level is fixed. Thus, ASIC can embed the exact amount of resources required. These two characteristics allow the construction of efficient hardware pipelines that substantially increase the throughput. K-best can achieve at least the same throughput as depth-first without the need for parallel cores. Hard-output implementations exceeding 2.5 Gb/s are reported using the breadth-first paradigm [25]. Another study focused on energy efficiency designed a breadth-first variant that can handle both the channel noise and the hardware noise generated by the voltage over the scaling method in memories [26].

Breadth-first detectors can provide soft-output using the max-log approximation and a list. This list approach is used by several detectors, including some other tree-search algorithm and detectors from other families. The point of this approach is to produce a list

For most breadth-first algorithms,

### 6.4 Best-first tree-search detector: fast descent tree-search and parallel tree-search

Best-first detectors are also sometimes called metric-first. The basic idea besides this enumeration is to not favor depth or breadth over each other. Instead, the node with the best PED is expanded, regardless of its level. The best-first algorithm keeps a node pool with nodes to be expanded. First, the pool is initialized with the root node. Then, at each iteration, the node with the lower PED is popped out from the pool, and all its children are computed and pushed in the pool unless they are leaves. If they are, a comparison with the best leaf so far allows us to keep track of the best result. When a leaf is reached, its objective function may be used as an upper-bound to prune the pool for each node with a PED higher than the new reference. The detection ends when the pool is empty.

This simple method quickly overfills the pool as several nodes are added when only one is popped out. Rather than providing a huge pool to contain all the nodes, improvement is to convert the

Recently, a best-first algorithm ASIC is reported to reach 800 Mb/s in a soft-input soft-output framework [29]. The modified algorithm, called cross-level parallel tree-search, splits the pool node into several pools, one per level. At each iteration, a node from each pool in popped out expanded using the best-child/best-sibling framework, and the new nodes are pushed in the according pool. Moreover, the presented detector prune nodes in each pool using the upper-bound to keep only the one that can improve the result or the counter-hypothesis (see Section 6.2 for details). The slit pool helps the parallelization process so that this algorithm variant is very suitable for hardware implementation.

## 7. Other unusual detectors: bioinspired and geometrical detectors

### 7.1 Deep neural MIMO detection: learning to detect

The rise of deep learning leads to the search for an efficient neural network to solve the detection problem such as DetNet [30]. This network is inspired by the projected gradient descent algorithm that is a major option to solve convex optimization. It is trained for both static channel (

with *k*th layer. Then, the normalization with the ZF result avoids over-penalizing the situation with bad

### 7.2 Bioinspired detectors

The most studied bioinspired decoders fall into two categories: ant colony optimizations (ACO) and particle swarm optimizations (PSO) that include the firefly algorithm (FA). These techniques are often very complex compared to the previously described algorithms, but they claim to be resilient to challenging conditions. Bioinspired algorithms should be able to decode messages with imperfect CSI, or the data streams are correlated.

ACO-based detectors simulate several ants that choose a path randomly to follow with a nonuniform probability function [31]. Each antenna is processed independently. At each iteration, an ant selects the symbol

with

FA-based detectors simulate several fireflies that try to find the best mating partner. The objective function determines the attractiveness of a firefly. Then a firefly goes toward more attractive congeners biased with a random influence to promote exploration [32]. Some FA variant implements a memory effect that makes it even closer to a PSO-based algorithm [33]. This framework is applied to MIMO detection using the QR decomposition described in Section 6.1. The FA represents each symbol to decode as a nest containing as many fireflies as the constellation size. Thus, the fireflies have to select a partner in each nest from the last symbol to the first based on the biased attractiveness. When the firefly population searched all the nests, the best path represents the decoded symbol vector. FA-based detectors can be related to tree-search-based algorithm with a randomness exploration and a fixed number of path allowed.

### 7.3 Geometrical detectors

Geometrical detectors are based on a two-step process: an exploration to find a small set of promising solutions and an exploitation to improve this set at a small cost. It follows the traditional approach in nonconvex optimization to perform simple descents that can be stuck in local optimums from several starting points. Geometrical detectors use a real-valued model and the singular value decomposition (SVD) rather than the QR one [34]. Let us rewrite the objective function by introducing the SVD of

Consequently, the objective function can be rewritten as

using

Let

A soft-output version of this algorithm is possible using the max-log approximation and the list approach detailed in [35], Section 5.2. A field-programmable gate array (FPGA) implementation has recently been proposed. This groundwork points out that geometrical detectors may achieve good performance in the future yet being far from mature at that point [36].

## 8. Conclusions and summary

MIMO detection is a well-studied problem that has been tackled from several perspectives. The mathematical interpretation, as a combinatorial optimization problem, leads to the optimal and linear detectors. From the signal processing perspective, detecting a signal means improving the SNR or SINR so that the direct answer is to cancel the interference and to remove the noise. From an algorithmic perspective, the detection problem is the search for the best path in a weighted tree that relies on some well-known algorithms. Other sources of inspiration, such as nature or geometry, provide some interesting perspectives. These paradigms and the associated detectors are summed up in Table 1, and we compare all of them according to the BER-complexity trade-off.

Detector | BER | Complexity | Comment |
---|---|---|---|

ML | Optimal | Dramatically complex | |

ZF | Very poor | Very simple | Best linear detector regarding SNR criterion |

MMSE | Poor | Simple | Best linear detector regarding SINR criterion |

SIC/OSIC | Good | Rather complex | Best when there is a clear ranking in the quality of each data stream |

PIC | Good | Rather complex | Best when all data streams have the same quality level |

Depth-first | Optimal | Very complex | |

Breadth-first | Good | Rather complex | Possible trade-off between BER and complexity via the number of surviving paths |

Best-first | Good | Less complex | |

Deep neural | Good | Rather complex | Possible trade-off between BER and complexity via the number of layers |

Bioinspired | Good | Very complex | Resilient to imperfect CSI and channel correlation |

Geometrical | Rather good | Rather complex | Possible trade-off between BER and complexity via the number of descents |

All these perspectives shed a different light on the problem, leading to fruitful experimentation. Indeed, some methods take inspiration from others to keep on improving. Therefore, some improvement axes remain open, for instance, the permanent decrease of complexity with equal performance, the development for efficient hardware implementations, or the optimization of the interaction with decoders to exploit channel codings better.

## Conflict of interest

The authors declare no conflict of interest but to research and publish in this area, in particular on geometrical detectors.

## Nomenclature

Memoryless | the channel outputs only depend on its state and its inputs. |

Linear | |

Flat | |

Block fading | the channel states vary slow enough to be considered constant over many symbols named coherence block. |

ACO | |

ASIC | application-specific integrated circuit |

CDMA | code-division multiple access |

CLPS | closest lattice-point search |

CSI | channel state information |

FA | firefly algorithm |

FPGA | field-programmable gate array |

LLR | log-likelihood ratio |

MBF | modified best-first |

MBF-FD | modified best-first fast descent |

ML | maximum likelihood |

MMSE | minimum mean-square error |

OSIC | ordered successive interference cancelation |

PED | partial Euclidean distance |

PIC | parallel interference cancelation |

PSO | particle swarm optimization |

QAM | quadrature amplitude modulation |

SD | sphere decoding |

SE | Schnorr-Euchner |

SDM | space-division multiplexing |

SIC | successive interference cancelation |

SIR | signal-to-interference ratio |

SINR | signal-to-noise-plus-interference ratio |

SNR | signal-to-noise ratio |

SVD | singular value decomposition |

ZF | zero forcing |

## Notes

- The nomenclature section provides definitions.
- It is widely believed that it does not exist any solution to solve NP-hard problems in polynomial time. Yet, this assumption is not proven so that there is still a possibility that such an algorithm exists and is just not discovered nowadays.