HaLoS – Integrated RF-Hardware Components for Ultra-Wideband Localization and Sensing

Ultra-Wideband (UWB) sensors exploit very weak electromagnetic waves within the lower microwave range for sounding the objects or processes of interest. The interaction of electromagnetic waves with matter offers many options for gaining information in a wide variety of scenarios. To mention only a few, it enables the assessment of the state of building materials and constructions, the investigation of biological tissue, the detection and localization of persons buried under rubble after an earthquake or of unauthorized people hidden behind walls, and much more [1]. The advantage of such methods lies in their non-destructive and continuous measurement procedure, which may work at high speed and without contact.


Introduction
Sensors exploiting electromagnetic interactions with the test object have been in use for a long time. However, most such sensors are restricted to a relatively narrow bandwidth and, consequently, can provide only a small amount of information about the test object. Provided that sophisticated data processing is applied, UWB sensors may be able to provide more information and thus reduce the ambiguities that are inherent in indirect measurement methods such as electromagnetic sensing.
Depending on the actual tasks, the requirements on the sensing system may be quite different, such as the optimum operational frequency band, measurement speed, sensitivity, system costs, reliability, power consumption etc. There are several UWB sensing principles known, each having specific advantages and disadvantages. Generally, one can state that the usability of UWB-sensors will be largely improved with increasing degree of system integration regardless of the sensor principle. The HaLoS-project addresses this topic by investigating general purpose UWB sub-modules like amplifiers, ADCs, fast processing units etc. as well as an integration-friendly sensor concept based on ultra-wideband pseudonoise codes.
The chapter is organized as follows. First, the most important performance figures of UWB sensors are introduced. Second, we give an overview of various UWB sensor principles currently in use and explain the UWB pseudo-noise concept. Then, we address some specific topics such as wideband receiver circuits, transmitter circuits and high-speed data capture. Finally, some aspects of monolithically integrated UWB sensors are discussed.

Key figures of UWB-sensors
The UWB sensor configuration is determined by demands that follow from two different and partially conflicting aspects. On the one hand, there are the UWB radiation rules; on the other hand, we have to respect the physical constraints of the sensing problem. The radiation rules, which differ between regions of the world, mainly limit the spectral power emission, restrict the operational frequency band and require sounding signals of large instantaneous bandwidth. Seen from a physical point of view, we need an operational frequency band which provides reasonable interaction between the sounding signal and the object of interest. This may conflict with the radiation rules for sensing tasks requiring wave penetration, such as through-wall radar or medical imaging. Thus, one has to find a suitable compromise in cases where the frequency mask would be violated. Though UWB sensors are excluded from long-range applications due to their low radiation power, they are well suited to biological and medical sensing since the exposure of the target is harmless. Furthermore, the interaction between sounding wave and target is based on linear phenomena. Hence, the sounding bandwidth may be provided instantaneously (complying with FCC or ECC radiation rules) or sequentially (violating these radiation rules) without affecting the measurement results, as long as the scenario under test remains stationary during the measurement. This chapter focuses on techniques for information capture by exploiting electromagnetic interactions. Hence, we do not exclude sensor principles or frequency bands violating UWB radiation rules from our further discussion.

Spectral band and related parameters:
As frequency diversity is a key issue for unambiguous information gathering by electric sensors, the width and the occupation density of the spectral sounding band are of major interest. For the sake of brevity, we will deal here only with baseband signals (see [2] for deeper discussions), which we characterize by their two-sided bandwidth $B$ that can be linked to typical time domain parameters:

$$B \approx \begin{cases} 1/t_w & \text{for pulse shaped signals} \\ 1/\tau_{coh} & \text{for CW signals} \end{cases} \qquad (1)$$

Here, $t_w$ represents the width of a pulse, and $\tau_{coh}$ is the coherence time of a random or pseudo-random signal (i.e. the width of the auto-correlation function). The occupation density of the frequency band is given by the line spacing $\Delta f$, which is determined either by the repetition rate $f_0$ of a periodic sounding signal ($t_P$ - period duration) or, via the Fourier transform, by the observation interval $T$ of non-periodic signals:

$$\Delta f = \begin{cases} f_0 = 1/t_P & \text{for periodic signals} \\ 1/T & \text{for non-periodic signals} \end{cases} \qquad (2)$$

As non-periodic signals are quite unusual in UWB sensing, we will not discuss them further.
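As a small numerical illustration of relations (1) and (2), the following sketch converts a pulse width (or coherence time) and a period duration into bandwidth and line spacing. The function names and the example values are hypothetical and only serve to show the orders of magnitude involved.

```python
def bandwidth_from_width(t_width_s):
    """Two-sided bandwidth B ~ 1/t_w (pulse) or 1/tau_coh (CW), relation (1)."""
    return 1.0 / t_width_s


def line_spacing(period_s):
    """Line spacing delta_f = f_0 = 1/t_P of a periodic signal, relation (2)."""
    return 1.0 / period_s


# Hypothetical example values (not taken from the chapter)
t_w = 0.2e-9   # pulse width or coherence time: 0.2 ns
t_P = 50e-9    # period duration: 50 ns

B = bandwidth_from_width(t_w)
delta_f = line_spacing(t_P)
n_lines = B / delta_f   # number of spectral lines falling into B

print(f"B = {B / 1e9:.1f} GHz, delta_f = {delta_f / 1e6:.1f} MHz, "
      f"lines = {n_lines:.0f}")
```

With these assumed values, a 0.2 ns wide signal repeated every 50 ns occupies about 5 GHz with a 20 MHz line spacing.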
The line spacing $\Delta f$ gives the frequency resolution of the sensor, or it determines the maximum observable length in the time domain. In the case of UWB radar sensing, we can convert (1) and (2) into corresponding spatial parameters. One of them is the range resolution $\delta_r$, i.e. the capability of the radar to separate two closely spaced point targets of identical reflectivity. We will refer to the usual relation ($c$ - wave velocity):

$$\delta_r = \frac{c}{2B}$$

even if it should be treated with care. The relation originates from narrowband radar, whose sounding signal is deformed neither by reflection at small bodies nor by antenna transmission. In contrast, a UWB signal bouncing off a point scatterer undergoes a twofold differentiation and further deformations due to the antennas. The unambiguous range $r_{ua}$ of the UWB radar relates to the signal repetition by:

$$r_{ua} = \frac{c\,t_P}{2} = \frac{c}{2 f_0}$$

In (6), $v$ denotes the radial speed of a target. Equation (5) simply expresses the Nyquist theorem, telling us that the refresh rate of the measurement, $1/T_R$, must be at least twice the bandwidth of the process to be observed. Relation (6) refers to the Doppler effect. It is caused by moving targets, which expand or compress the scattered signal. If such signals are accumulated (by correlation and/or synchronous averaging) over too long a duration, they de-correlate, resulting in an amplitude degradation of the received signal and finally in the loss of the target. Equation (6) should not be confused with Doppler ambiguity, which is not relevant for UWB sensing.

Dynamic range:
Another group of important features relates to the sensitivity of weak signal detection. For illustration, we consider Fig. 1. The illustration on the left-hand side symbolizes the response of a single target. The red line represents the transmitter pulse (or the auto-correlation function of a wideband CW signal), and the black line is the target return, which ideally should be the only signal visible on the receiver screen.
Obviously, many additional signal components may appear and hamper the detection of weak targets. These perturbing signals are random noise (electronic and quantization noise) and device-internal clutter. The clutter depends on the receiving signal and may be caused by linear effects (internal mismatch, cross-coupling, frequency-dependent transmission behavior of electronic components) and non-linear effects (e.g. device saturation). Fig. 1 (right) depicts the typical dependence of the perturbations on the level of the receiving signal. Based on this, we can define various dynamic ranges:
• Clutter-free dynamic range $D_{cl}$: It refers to the level difference between the receiving signal and the strongest internal clutter peak. $D_{cl}$ determines the sensitivity to detect weak targets in the presence of a strong one, and also the strength of artefacts in radar images.
• Optimum dynamic range $D_{opt}$: Internal clutter caused by linear effects can be removed by sensor calibration as used in network analyzer measurements. Assuming a perfect calibration, the erroneous signals are then limited by the noise floor and non-linear distortions. Hence, we get optimum conditions for a large dynamic range at the intersection of the noise and third-order distortion lines.
• Maximum dynamic range $D_{max}$: It is defined by the difference between the 1 dB compression point and the noise level. Its value indicates the sensitivity to detect moving targets of weak reflectivity. In many cases, the strongest backscatter signals are caused by static objects. As long as the UWB sensor is not moved, these signals and their clutter contributions are stationary, so that they may simply be removed by high-pass filtering in observation time. Hence, the detectability of moving targets is only limited by the noise level.
The maximum dynamic range can be roughly estimated using relation (7), taken from [2], which involves the following quantities: $\eta_r$ - receiver efficiency; $T_R$ - recording time; $V_{1dB}$ - input voltage at the 1 dB compression point (before correlation); $k$ - Boltzmann constant; $T_0$ - temperature; $CF$ - crest factor; $F$ - noise factor; $R_0$ - receiver input impedance; $B$ - receiver bandwidth; $ENOB$ - effective number of bits (before correlation).
The left part of eq. (7) applies performance parameters of analog receivers while the right part deals with the global effective number of bits merging the performance of analog and digital receiver components.
• System performance $D_{sys}$: It relates the transmitter level to the noise level. Hence, it is given by the maximum dynamic range and the attenuation of the strongest transmission path.
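To give a feeling for how the quantities listed for the maximum dynamic range interact, the following sketch uses a generic textbook-style estimate. It is an assumption made for illustration and not necessarily identical to relation (7): the average signal power at compression is taken as $V_{1dB}^2 / (CF^2 R_0)$ with $V_{1dB}$ interpreted as a peak voltage, the noise power density as $k T_0 F$, and the accumulation gain as $\eta_r T_R B$, so that the bandwidth cancels. All numerical values are hypothetical.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def max_dynamic_range_db(eta_r, t_rec, v_1db, crest_factor,
                         noise_factor, r0=50.0, temp=290.0):
    """Assumed estimate: D_max ~ eta_r * T_R * P_avg / (k * T0 * F).

    v_1db is treated as a peak voltage (assumption), so the average
    power is v_1db**2 / (crest_factor**2 * r0).
    """
    p_avg = v_1db ** 2 / (crest_factor ** 2 * r0)        # W
    noise_density = K_BOLTZMANN * temp * noise_factor    # W/Hz
    return 10.0 * math.log10(eta_r * t_rec * p_avg / noise_density)


# Hypothetical receiver: efficiency 1e-3, 1 ms recording time, F = 4 (6 dB)
common = dict(eta_r=1e-3, t_rec=1e-3, v_1db=0.1, noise_factor=4.0)

d_pn = max_dynamic_range_db(crest_factor=1.0, **common)      # PN-like signal
d_pulse = max_dynamic_range_db(crest_factor=10.0, **common)  # pulse-like signal

print(f"low crest factor:  {d_pn:.1f} dB")
print(f"high crest factor: {d_pulse:.1f} dB")
```

Under these assumptions, a tenfold larger crest factor costs 20 dB of dynamic range, which is the qualitative point made about sounding signals of large crest factor.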

Time and frequency errors:
While the device characteristics mentioned above refer to the ordinate of a signal representation, the following features quantify the quality of the abscissa, i.e. the time or frequency axis. Here, we can observe systematic deviations such as a non-linear frequency or time axis, resulting in non-equidistant sampling and distortions of frequency-time conversions. Random errors of the time or frequency axis are called jitter or phase noise in the case of short-term variations and drift in the case of long-term variations. Jitter (or phase noise) causes signal-dependent noise which is elevated at signal edges and disappears in flat parts of the signal.
Jitter limits the performance of super resolution techniques and reduces the sensor sensitivity to detect weak scattering targets in the vicinity of strong reflectors.
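The signal dependence of jitter noise can be made plausible with the usual first-order approximation, in which a timing error of rms value $\sigma_j$ translates into an amplitude error of about $|dx/dt| \cdot \sigma_j$. The sketch below applies this approximation to a Gaussian test pulse; the pulse shape, the jitter value and all names are assumptions chosen for illustration.

```python
import math


def gaussian_pulse(t, t0=1e-9, width=100e-12):
    """Hypothetical test signal: Gaussian pulse centered at t0."""
    return math.exp(-0.5 * ((t - t0) / width) ** 2)


def jitter_noise_rms(signal, t, sigma_jitter, dt=1e-13):
    """First-order estimate of the rms amplitude error caused by jitter.

    The slope is obtained from a central difference; the amplitude
    error is approximated as |slope| * sigma_jitter.
    """
    slope = (signal(t + dt) - signal(t - dt)) / (2.0 * dt)
    return abs(slope) * sigma_jitter


sigma_j = 1e-12  # 1 ps rms jitter (assumed)

noise_on_edge = jitter_noise_rms(gaussian_pulse, 0.9e-9, sigma_j)   # steep flank
noise_on_peak = jitter_noise_rms(gaussian_pulse, 1.0e-9, sigma_j)   # flat top
noise_far_away = jitter_noise_rms(gaussian_pulse, 3.0e-9, sigma_j)  # baseline

print(noise_on_edge, noise_on_peak, noise_far_away)
```

The estimate is largest on the pulse flank and practically vanishes on the flat top and on the baseline.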

Efficiency:
The term efficiency can be seen under different aspects. We will consider three of them here.

Receiver efficiency $\eta_r$ (see also (7)): The receiver efficiency describes the capability of the receiver to exploit the incident signal energy. As the received signals are usually quite weak due to the restricted transmission power, great importance has to be attached to the receiver efficiency. It is determined by losses in the receiver front end, e.g. the insertion loss of filters or the conversion loss of mixers or sampling gates. However, dead times in energy accumulation due to filter settling, incomplete data capture caused by sub-sampling, or incomplete exploitation of captured data due to serial instead of parallel data processing are much more important. Thus, the efficiency of recent UWB receivers is often reduced to values below 1 ‰, which leaves considerable potential for further improvements.

Typical figure-of-merit (FoM) values for high-speed ADCs are found at about 10 pJ/conversion. Hence, the power requirement of a 6 bit ADC @ 10 GHz is in the order of 6 W.
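The quoted power figure can be checked with the widely used ADC figure of merit $FoM = P / (2^{ENOB} f_s)$. That the chapter's FoM definition has exactly this form is an assumption here, although it reproduces the numbers given above; the function name is hypothetical.

```python
def adc_power_w(fom_j_per_conv, enob_bits, f_sample_hz):
    """Power estimate from an assumed FoM = P / (2**ENOB * f_s)."""
    return fom_j_per_conv * (2 ** enob_bits) * f_sample_hz


# Values quoted in the text: 10 pJ/conversion, 6 bit, 10 GHz
power = adc_power_w(10e-12, 6, 10e9)
print(f"{power:.1f} W")  # about 6.4 W, i.e. in the order of 6 W
```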
The second example relates to an amplifier whose FoM-value is expressed by: …

The FoM approach can be extended to further electronic components and to numerical algorithms as well. We can draw two conclusions from the FoM philosophy. Firstly, the designer of an electronic sub-system or algorithm has to achieve a reasonably small FoM-value with the design. Secondly, the designer of the whole system gets some hints on the feasibility of the system concept and the scope of its features if the corresponding FoM-values are known.

… [3] should be performed. However, this point will not be considered here as it would go beyond the scope of this chapter.
Without going into detail, we would like to mention at least some further aspects that influence the performance of sensor operation, too. They concern interference issues like robustness against jamming and low probability of intercept (LPIR-low probability of intercept radar).
The performance figures summarized above are the basis for deciding on a certain sensor configuration for a specific application. In what follows, the most popular UWB sensor principles will be tabulated and assessed with respect to the introduced performance figures.

Principles of UWB-sensors
We divide the UWB sensor principles into two groups. While the sensors of the first group generate sounding signals of large instantaneous bandwidth, the devices belonging to the second group deal with narrowband signals swept over a large bandwidth. A thorough analysis of the different sensor concepts of both groups including a reference list can be found in [2]. Here, we will only give a short summary to get an impression of the most common sub-components of UWB sensors and to understand the advantages and disadvantages of the various principles.

Sensors of large instantaneous bandwidth
There are several known UWB approaches exploiting signals of large instantaneous bandwidth. Usually, they are named according to the sounding signal applied by the sensor. Typical representatives of this signal class are:
• sub-nanosecond pulses
• very wideband pseudo-noise codes
• multi-carrier signals (also referred to as multi-sine), and
• white random noise.
By assumption, these signals have a bandwidth in the GHz range, often requiring Nyquist rates of the measurement receivers above 10 GHz. Disregarding the device costs, this is hard to achieve with the limited power budget and the restricted means of data handling (see section 2.1 - Figure of Merit and Data throughput) which a sensor usually has at its disposal. Hence, all these devices must reduce their data rates at the expense of receiver efficiency, which is reflected in a reduced dynamic range $D_{max}$ (see (7)). The data rate reduction is achieved either by sub-sampling or by serializing the data recording. Here, the signal shaper may be a pulse generator, a binary PN-generator or an arbitrary waveform generator. The most frequently found device implementations apply sub-nanosecond pulse generators. Indeed, the concept allows the implementation of very cost-effective and power-saving sensors. However, their system performance often suffers from reduced dynamic range due to the large crest factor of the sounding signal (compare (7)); they do not provide jitter suppression (see also sub-chapter 2.3), and they are not robust against jamming. Wideband PN-generators are an interesting alternative to pulse generators since they provide powerful signals of low magnitude (i.e. of low crest factor). Arbitrary waveform generators are able to provide signals which can be flexibly adapted to the measurement problem. However, they are quite expensive, power-hungry and limited with respect to bandwidth. Hence, they have not yet been found in practically applicable sensor concepts.
• Sub-sampling receiver: This is the most frequently applied UWB concept. It presupposes periodic sounding signals ($t_P$ - signal period). Typically, the measurement signals are captured by sequential sampling, providing one data sample per period whose time position is shifted stepwise over the whole signal. The actual sampling interval is $t_P + \Delta t$, while the equivalent sampling interval, which has to meet the Nyquist criterion, is $\Delta t$. Newer concepts apply interleaved sampling, permitting higher sampling rates since more than one point per period is taken. The classical concept of time shift control uses the fast ramp-slow ramp approach which, however, tends towards a non-linear time axis representation, sampling jitter and time drift. A second method uses two stable sine wave generators (e.g. Direct Digital Synthesizers) of slightly different frequencies $f_1$ and $f_2$. This reduces time drift and avoids time axis non-linearity. However, the sampling jitter remains quite high since the trigger events launching the sounding pulse and activating the sampling gates are derived from the relatively flat edges of the two sine waves of (comparatively low) frequency $f_1$ and $f_2$. Timing control based on digital counters for coarse timing exploits steep trigger edges, improving the jitter performance. The fine tuning is then typically done by programmable delay chips which consist of hundreds of delay gates. As these gates are not absolutely identical, the delay line cannot ensure equidistant sampling. Furthermore, the delay time depends on temperature, and the huge number of gates consumes plenty of energy.
• Analog correlator: Due to the lack of programmable analog wideband delay lines, two wideband sources (pulse or PN-sequence) are applied, providing two identical signals which are shifted in time. The time shift may be controlled by the same approaches as mentioned above. One of these signals stimulates the DUT, and the other one acts as reference in a correlator. Even if the mixer and the integrator do not waste signal energy, the correlator has about the same efficiency as a sequential sampling receiver as long as one does not use parallel correlation stages. We can see from eq. (7) that the correlation principle provides the best dynamic range due to the large time-bandwidth product. But this benefit is forfeited if sounding signals of large crest factor are applied.
• Sub-sampling correlator: Here, random noise can also be used as stimulus. The time lag between measurement and reference signal is realized by shifting the sampling time as explained before. The correlation is done in the numerical domain. The approach is quite time-consuming since the averaging time must be long in order to achieve a stable estimate.
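The sequential sampling scheme of the sub-sampling receiver can be illustrated by a short simulation. It is only a sketch under idealized assumptions (no jitter, no drift, perfectly periodic signal); the waveform, the period and the step width are hypothetical.

```python
import math

T_P = 10e-9       # signal period t_P (assumed)
DELTA_T = 0.1e-9  # equivalent sampling interval delta_t (assumed)


def periodic_signal(t):
    """Periodic test waveform with period T_P (fundamental + 3rd harmonic)."""
    phase = 2.0 * math.pi * t / T_P
    return math.sin(phase) + 0.5 * math.sin(3.0 * phase)


def sequential_sampling(signal, t_period, delta_t):
    """Take one sample per period, each delayed by a further delta_t.

    Sample k is taken at real time k * (t_period + delta_t), which
    corresponds to the equivalent time k * delta_t within one period.
    """
    n_points = round(t_period / delta_t)
    return [signal(k * (t_period + delta_t)) for k in range(n_points)]


captured = sequential_sampling(periodic_signal, T_P, DELTA_T)

# Reference: the same period sampled directly on the fine time grid
reference = [periodic_signal(k * DELTA_T) for k in range(len(captured))]
max_error = max(abs(a - b) for a, b in zip(captured, reference))

# Real time needed to collect one complete equivalent-time record
acquisition_time = len(captured) * (T_P + DELTA_T)
print(len(captured), max_error, acquisition_time)
```

The captured record reproduces one signal period on a 0.1 ns grid, but collecting it takes about 100 periods. This is the loss of receiver efficiency that sub-sampling trades for a low real sampling rate.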

Sensors of narrow instantaneous bandwidth
Strictly speaking, such sensors do not belong to UWB systems, but they do the same job as real UWB devices when applied for sensing. Hence, they are worth considering. … which can be transformed via IFFT into the impulse response function. Simple implementations (e.g. many FMCW-radars) dispense with vector receivers and only deal with the in-phase component.
Measurement principles applying sine waves provide the best suppression of noise and harmonic distortions due to narrowband filtering before signal capture. Their receiver efficiency approaches one as long as the settling times of the resolution filters and the signal source are negligible compared to the recording time. Hence, such devices often suffer from long measurement durations, which leads to strong range-Doppler coupling. The recording time can be reduced either by simultaneous measurements at different frequencies [7] (requiring complex parallel receivers and synthesizers) or by dispensing with the narrowband filters (giving up the sensitivity benefits compared to the wideband approaches).
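The transformation of swept-frequency data into an impulse response mentioned above can be sketched with a plain inverse discrete Fourier transform. The example assumes an idealized DUT consisting of a single delayed reflection and a complex-valued (vector) measurement at equally spaced frequencies; all names and values are hypothetical.

```python
import cmath


def idft(spectrum):
    """Inverse discrete Fourier transform (direct O(N^2) evaluation)."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * m / n)
                for k in range(n)) / n
            for m in range(n)]


N_FREQ = 64      # number of frequency steps (assumed)
DELAY_BINS = 5   # delay of the toy reflector in time bins (assumed)
AMPLITUDE = 0.3  # reflection amplitude (assumed)

# Frequency response of an ideal delayed reflection:
# H[k] = a * exp(-j * 2 * pi * k * d / N)
freq_response = [AMPLITUDE * cmath.exp(-2j * cmath.pi * k * DELAY_BINS / N_FREQ)
                 for k in range(N_FREQ)]

impulse_response = idft(freq_response)
peak_index = max(range(N_FREQ), key=lambda m: abs(impulse_response[m]))
print(peak_index, abs(impulse_response[peak_index]))
```

The impulse response shows a single peak at the assumed delay bin. A receiver that captures only the in-phase component would provide just the real part of `freq_response`, so that the recovered response is no longer unique in this simple form.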

UWB pseudo-Noise Concept
Under the assumption of pseudo-noise (PN) codes for sounding, Nyquist sampling for data capture and embedded pre-processing for data reduction, the principle depicted at the top of Fig. 2 seems to be the most promising if one weighs the pros and cons of the various UWB principles with respect to monolithic integration, system performance, MIMO-capability and power consumption. Fig. 4 represents the modified structure adapted to the conditions mentioned above. The use of two receiver channels yields the best performance with respect to different application aspects such as synchronous measurement of stimulus and response signal, the opportunity for device calibration, difference or interferometric measurements as well as long-term sensor stability. A stable microwave oscillator controls the whole system. It has to provide only a single frequency $f_c$, which allows the use of simple and stable generator concepts. The oscillator drives a high-speed shift register. Depending on its feedback structure, it provides different binary sequences. M-sequences are preferred due to their favorable autocorrelation function. Other options could be Golay codes [8] or Gold codes if cross-correlation properties are of primary interest.

… This clock originates from a stable RF-generator and a digital frequency divider which has to run through all its states before it can launch a new impulse. Hence, any internal deviations between the flip-flops involved have no effect on the divided signal. Therefore, apart from the remaining jitter, we can expect exactly equidistant sampling, i.e. an absolutely linear time axis representation.
4. The principle of interleaved sampling allows the sampling rate to be varied while keeping the sensor concept. Thus, one can reduce the sampling rate in favor of reduced power consumption and device costs, or increase it to improve the receiver efficiency $\eta_r$, depending on the development state of high-speed electronics.
5. Nyquist sampling provides the lowest possible data throughput without violating the sampling theorem.
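The generation of an M-sequence by a fed-back shift register and its favorable autocorrelation function can be illustrated as follows. The register order, the feedback taps and the function names are chosen for this example only and are not taken from the sensor described here; practical devices use considerably longer registers.

```python
def m_sequence(taps, order):
    """Generate one period of a binary sequence with a Fibonacci LFSR.

    taps:  1-based register stages that are XOR-ed to form the feedback bit
    order: number of register stages; the period is 2**order - 1 if the
           taps correspond to a primitive polynomial
    """
    state = [1] * order          # any non-zero start state works
    sequence = []
    for _ in range(2 ** order - 1):
        sequence.append(state[-1])           # output of the last stage
        feedback = 0
        for tap in taps:
            feedback ^= state[tap - 1]
        state = [feedback] + state[:-1]      # shift by one stage
    return sequence


def periodic_autocorrelation(bits):
    """Periodic autocorrelation of the sequence mapped to +1/-1."""
    bipolar = [1 - 2 * b for b in bits]
    n = len(bipolar)
    return [sum(bipolar[i] * bipolar[(i + lag) % n] for i in range(n))
            for lag in range(n)]


# 4-stage register with feedback from stages 4 and 3 (x^4 + x^3 + 1)
seq = m_sequence(taps=(4, 3), order=4)
acf = periodic_autocorrelation(seq)

print(seq)
print(acf)
```

For this register the period is 15 chips, and the periodic autocorrelation equals 15 at zero lag and -1 at all other lags, i.e. it approximates a single sharp peak.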
The embedded pre-processing mainly aims at data reduction by synchronous averaging (the measurement rate is often much higher than required by the time variance of the test scenario), static background removal or signal transformations. It should be noted, however, that impulse compression (in order to get the impulse response) performed at this point will increase the data throughput toward the main processor since the word length of the data samples increases.
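The effect of synchronous averaging on data throughput and word length can be sketched with a few lines of integer arithmetic. The example assumes that corresponding samples of repeated periods are simply summed and that the accumulator is sized to avoid overflow; the sampling rate, ADC resolution and number of averages are hypothetical.

```python
import math


def synchronous_sum(periods):
    """Accumulate corresponding samples of repeated signal periods."""
    return [sum(column) for column in zip(*periods)]


def accumulator_bits(adc_bits, n_averages):
    """Word length needed to sum n_averages ADC samples without overflow."""
    return adc_bits + math.ceil(math.log2(n_averages))


def output_data_rate(sample_rate_hz, adc_bits, n_averages):
    """Bit rate after averaging: fewer, but longer, data words."""
    return sample_rate_hz * accumulator_bits(adc_bits, n_averages) / n_averages


# Three noisy repetitions of a 4-sample period (toy data)
periods = [[1, 2, 3, 4],
           [1, 2, 3, 5],
           [1, 3, 3, 4]]
summed = synchronous_sum(periods)

raw_rate = 1e9 * 4                              # 1 GSa/s at 4 bit (assumed)
averaged_rate = output_data_rate(1e9, 4, 256)   # 256 averages (assumed)
print(summed, accumulator_bits(4, 256), raw_rate / averaged_rate)
```

With these assumed numbers, 256 averages lengthen each data word from 4 to 12 bit but still reduce the bit rate by a factor of about 85.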
The sensor principle depicted in Fig. 4 is basically also able to deal with short sub-nanosecond pulses. However, this would greatly degrade the performance of the system, which is largely determined by the amount of signal energy accumulated in the receiver. In the case of pulse signals, this requires amplifiers with high compression points and high-resolution ADCs since the whole signal energy is concentrated in a short moment. Furthermore, the measurement object may be exposed to strong fields in the case of near-field measurements. The application of PN-codes avoids all these flaws since such signals carry enough energy even with small signal magnitudes. As the impulse compression (leading to high crest factor signals) is performed in the digital domain, the analog sensor components and test objects are spared from high voltage peaks. It is well known that the impulse compression of time-extended wideband signals greatly improves the dynamic range. As shown in [2] (chapter 4.7.3), it also reduces the jitter susceptibility. The impulse compression distributes the jitter power evenly over the whole signal like additive noise. However, a noise increase above the "natural" level of electronic and quantization noise cannot be observed since the jitter-induced perturbations remain quite low due to the measures described above. Hence, the edges of the impulse response of a DUT measured by the PN-principle are not affected by jitter, as is usually the case in pulse systems.

The simple timing concept of the PN-sensors enables the implementation of large MIMO arrays in which the number of cascaded measurement units is basically not limited. The principle is shown in Fig. 6. However, the data handling becomes increasingly demanding with a rising number of channels. In a typical operation mode, the transmitters are activated sequentially while the receivers of all channels work in parallel. Some details of implemented MIMO-systems can be found in chapter 11 (ultraMedis).
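The digital impulse compression discussed above can be sketched as a circular cross-correlation of the received signal with the transmitted M-sequence. The example is idealized (noise-free, full-rate sampling, a toy DUT with two propagation paths); the register configuration, the DUT and the offset correction step are assumptions made for illustration.

```python
def m_sequence(taps, order):
    """One period of an LFSR sequence (taps: 1-based feedback stages)."""
    state = [1] * order
    sequence = []
    for _ in range(2 ** order - 1):
        sequence.append(state[-1])
        feedback = 0
        for tap in taps:
            feedback ^= state[tap - 1]
        state = [feedback] + state[:-1]
    return sequence


def circular_convolve(x, h):
    """Response of a linear DUT with impulse response h to periodic x."""
    n = len(x)
    return [sum(h[k] * x[(i - k) % n] for k in range(len(h)))
            for i in range(n)]


def impulse_compression(received, stimulus):
    """Circular cross-correlation of the received signal with the stimulus."""
    n = len(stimulus)
    return [sum(received[(i + lag) % n] * stimulus[i] for i in range(n))
            for lag in range(n)]


# Low crest factor stimulus: +1/-1 M-sequence of 15 chips
stimulus = [1 - 2 * b for b in m_sequence((4, 3), 4)]

# Toy DUT: two paths with delays of 2 and 5 chips (assumed)
dut = [0.0, 0.0, 1.0, 0.0, 0.0, 0.5]

received = circular_convolve(stimulus, dut)
compressed = impulse_compression(received, stimulus)

# The autocorrelation of an M-sequence is N at lag 0 and -1 elsewhere.
# This small offset can be removed exactly in the noise-free case:
n = len(stimulus)
offset = sum(compressed)
estimate = [(c + offset) / (n + 1) for c in compressed]
print([round(v, 3) for v in estimate])
```

The stimulus never exceeds a magnitude of one, while the compressed signal shows pronounced peaks at the two path delays. The peaks only arise in the numerical domain, which is why the analog components and the test object are spared from them.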
The receiver of the UWB PN-sensor applies sub-sampling for data capture. Hence, its efficiency leaves some room for further improvement. This would, however, require a considerable increase of the sampling rate $f_s$. As we can see from (8), raising the sampling rate has to be done at the expense of the ADC resolution since the FoM-value is primarily fixed by the semiconductor technology, while the maximum power is limited by the achievable heat transport. However, simply increasing the sampling rate on the basis of low-bit ADCs will not bring any benefit with respect to the sensor performance; on the contrary, the performance will degrade.
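The trade-off between sampling rate and resolution can be illustrated under the assumption that (8) has the common form $FoM = P / (2^{ENOB} f_s)$. This form is not confirmed by the text available here, and the power budget and FoM value below are hypothetical.

```python
import math


def affordable_enob(power_budget_w, fom_j_per_conv, f_sample_hz):
    """ENOB allowed by a power budget, assuming FoM = P / (2**ENOB * f_s)."""
    return math.log2(power_budget_w / (fom_j_per_conv * f_sample_hz))


POWER_BUDGET = 1.0   # W, assumed limit set by heat transport
FOM = 10e-12         # J per conversion, assumed technology constant

enob_1g = affordable_enob(POWER_BUDGET, FOM, 1e9)
enob_2g = affordable_enob(POWER_BUDGET, FOM, 2e9)
enob_4g = affordable_enob(POWER_BUDGET, FOM, 4e9)

for rate, enob in ((1e9, enob_1g), (2e9, enob_2g), (4e9, enob_4g)):
    print(f"{rate / 1e9:.0f} GSa/s -> {enob:.2f} bit")
```

Under this assumption, every doubling of the sampling rate at constant power and constant FoM costs one effective bit.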
However, as the update rate of UWB PN-sensors is much higher than required by the time variance of the test object, the difference between two consecutive measurements is very small, so that low-resolution ADCs are sufficient for capturing these deviations. This presupposes a fast control loop and a (less power-hungry) DAC of sufficient resolution which provides the captured signals from previous measurements as reference. Some basic considerations related to this type of feedback sampling can be found in [2]. Details of the layout and implementation of related sub-components are discussed in sections 3.4, 5 and 6.2.

Introduction
In order to support system design in its individual stages, different amplifier versions have been developed. Classical ultra-wideband low-noise amplifiers (UWB-LNAs) have been implemented first to ensure early availability and to assess the SiGe BiCMOS technology applied. Then, new receiving components have been considered to address the requirements discovered in system design. This way, a new subtraction amplifier has been made available which allows for practical evaluation of the feedback sampling approach. Fig. 7 illustrates the way in which LNAs and subtraction amplifiers are used as part of the system. The latest subtractor also accounts for low feeding point impedance, which is imperative for the conceptual design of dielectrically scaled antennas. The use of such a device is intended by the collaborative project ultraMedis (see chapter 11). For establishing common interface definitions, the performance of individual components has to be characterized by appropriate metrics. While those are well established for single-ended arrangements, this is not the case with the noise characterization of multiport or differential structures. Hence, a new de-embedding scheme for the noise figure of a differential device has been developed and will be presented.

LNAs for the basic M-Sequence system
Within the basic M-sequence system, low noise amplifiers (LNAs) perform the classical task of adapting the input signal swing to the dynamic range of the analog-to-digital converter (ADC) while adding only a minimum amount of excess noise and providing reasonable power-match conditions. If a high-gain LNA is used, the system is also less sensitive to noise added by succeeding components. Gain, in turn, is limited by the required linearity, and an appropriate compromise with respect to all counteracting requirements has to be found. While trading one parameter against the other, the conditions set by the technology have to be considered for the individual LNA. Resonant tuning and resistive feedback topologies are predominantly used in the literature for mapping specifications to circuit designs. Though a resonant solution is favored in [11], the authors admit that the parasitic base resistance of bipolar transistors contributes substantially to the output noise. Thus, the advantage of the extraordinarily low noise figures enabled by narrow-band resonant designs, as compared to designs matched by resistive feedback, is diminished. High magnetic field gradients potentially encountered in some of the applications and the limited ability to use shielding, as identified in [12], also make the use of inductors questionable. Therefore, resistive feedback solutions have been preferred, as their use additionally comes with notable die size advantages. While the design of individual amplifiers will be covered in the following subsections, general guidelines can be taken from standard textbooks. In [13], for example, the impact of feedback on noise and impedance match is analyzed in detail. For the design, too, a simplified version of the bipolar transistor small signal equivalent circuit model with additional noise sources as presented in [13] has been used.
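The statement that a high-gain LNA makes the system less sensitive to the noise of succeeding components follows from the well-known Friis formula for cascaded stages. The sketch below evaluates it for two hypothetical receiver chains; the gain and noise figure values are assumptions and do not describe the amplifiers presented in this chapter.

```python
import math


def db_to_lin(x_db):
    return 10.0 ** (x_db / 10.0)


def cascaded_noise_figure_db(stages):
    """Friis formula: F = F1 + (F2 - 1)/G1 + (F3 - 1)/(G1*G2) + ...

    stages: list of (gain_db, noise_figure_db) in signal-flow order.
    """
    total_f = 0.0
    gain_product = 1.0
    for index, (gain_db, nf_db) in enumerate(stages):
        f = db_to_lin(nf_db)
        if index == 0:
            total_f = f
        else:
            total_f += (f - 1.0) / gain_product
        gain_product *= db_to_lin(gain_db)
    return 10.0 * math.log10(total_f)


# Hypothetical noisy back end (e.g. sampler plus ADC): 0 dB gain, NF 20 dB
back_end = (0.0, 20.0)

nf_high_gain = cascaded_noise_figure_db([(20.0, 3.0), back_end])
nf_low_gain = cascaded_noise_figure_db([(10.0, 3.0), back_end])

print(f"LNA gain 20 dB: system NF = {nf_high_gain:.2f} dB")
print(f"LNA gain 10 dB: system NF = {nf_low_gain:.2f} dB")
```

With the assumed values, raising the LNA gain from 10 dB to 20 dB lowers the system noise figure from roughly 10.8 dB to roughly 4.7 dB, at the price of a larger signal swing at the following stages.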

Multiple resistive feedback LNA
One of the implemented amplifier versions inspired by classical UWB-LNAs is depicted in Fig. 8. According to [10], this is a popular wideband amplifier topology often referred to as the Kukielka amplifier. Owing to the numerous results reported in the literature for this kind of amplifier, the impact of technology on circuit performance can be assessed. For comparison, SiGe implementations as presented in [9] are especially valuable. The main characteristics of this amplifier are set by the core circuit, which comprises transistors $Q_1$ to $Q_3$. For analysis, the Darlington pair $Q_2$-$Q_3$, which is used for gain-bandwidth extension of the second stage, is treated as a single compound transistor $Q_{23}$. Within the simplified circuit thus obtained, four feedback loops can be identified. For proper biasing, series-series feedback is applied to $Q_1$ and $Q_{23}$. By this measure, the bandwidth of both stages is improved. In turn, the input and output impedances of the amplifier are increased rather than decreased as required for input and output power matching. Hence, local shunt-shunt feedback is applied to $Q_{23}$ in order to reduce the output impedance. Finally, to enable input power matching, global shunt-series feedback from the emitter node of $Q_{23}$ to the base of $Q_1$ is applied. According to [10], this configuration tends to present an overdamped response. For this reason, peaking capacitors $C_{P1}$ and $C_{P2}$ are inserted to improve the frequency behavior of the amplifier. The addition of peaking capacitors might, however, impair stability, which therefore has to be diligently observed during design, as stated in [13]. In the same publication, an approximate calculation of the noise figure (NF) for this topology reveals that the NF is dominated by the noise properties of $Q_1$ as long as … . For this reason, selecting the bias current of transistor $Q_1$ to yield the optimal current density with respect to its noise properties should be a first step in the design.
After this initial step, one of the directed design procedures given in [10] or [9], respectively, can be used for further development. Those are derived from input and output power match conditions as well as from pole positions. In order to account for the characteristics of the Darlington pair transistors $Q_2$ and $Q_3$, … can be applied according to [10]. In Fig. 8 (left), an emitter follower has been attached to further improve the output power matching in the technology used.
amplifier, on-waver measurements have been performed using a PM 8 probe station of Süss MicroTec (now acquired by Cascade Microtech). Due to the measurement arrangement, losses preceding and succeeding the device under test (DUT), i.e. the amplifier, cannot be avoided. However, their impact on the scattering parameters of the DUT can be eliminated by proper calibration of the network analyzer applied. Also, the spectrum analyzer with noise figure measurement personality in use allows for the specification of losses preceding and succeeding the DUT which are compensated for during measurement in this case 2 . Measurement results obtained in this way are shown in Fig. 9 together with results from post-layout simulation. Initially, they have been presented in [14]. From Fig. 9, peaking in the gain curve progression of the measurement data can be observed as compared to the results from post layout simulation. The maximum difference appears at about 8 GHz, which is the frequency at which a notch in measured 11 S values also appears.
In [14] it is thus suspected that this deviation arises from the interaction of the test set-up with the DUT. In summary, the results presented in Fig. 9 for the low-cost technology applied compare well with the state-of-the-art performance reported in the literature at that time.
For a more detailed analysis, the reader may consult [14].

Active Feedback LNA
This amplifier has been inspired by the work presented in [15]. Due to the characteristics of the applied technology, certain adaptations have been required, though. Fig. 10 (left) shows the schematic diagram of the final design. It is a one-stage amplifier with resistive emitter degeneration to ensure a stable DC operating point and to improve bandwidth. Due to the presence of the peaking capacitor $C_P$, degeneration is gradually shifted to higher frequencies. As in the case of the multiple resistive feedback amplifier, this technique has to be used with care to ensure that it does not impair amplifier stability. Input matching of $Q_1$ is achieved by feedback via transistor $Q_2$ as well as resistors $R_{F1}$ and $R_{F2}$. The advantage of $Q_2$ is twofold: it improves the isolation between the input node and the output node in forward direction and, according to [15], it helps to enlarge the collector-emitter voltage of $Q_1$. Thus, the maximum oscillation frequency $f_{max}$ is expected to be increased, and the large-signal behavior is said to be improved. The amplifier according to Fig. 10 was presented in [14] for the first time. Compared to the amplifier in [15], the inductor used for improving the frequency behavior has been removed from the design, while the peaking capacitor has been added. Also, for better output matching, emitter follower $Q_3$ has been attached. To avoid a lengthy discussion of circuit characteristics, Fig. 10 (left) uses an alternative way to depict the circuit as compared to [14] or [15]. This representation points out the large similarity of the feedback paths in the two amplifiers shown in Fig. 8 and Fig. 10. Though the actual implementations of the passive feedback networks differ, shunt-series feedback is applied for input matching in both cases, and many design steps can be executed by analogy. Fig. 10 (right) shows the chip photograph of two variants, which have been implemented to assess the impact of layout on circuit performance.
In the first version, 90° lead corners are avoided and consecutive 45° lead corners are used instead. By this measure, the average lead length is increased. By contrast, a compact design has been targeted in the second layout version. As discussed in [14], results do not differ significantly as long as the length of the lead connecting the RF input with the first amplifying transistor is kept comparably long. Thus, only results for the second layout version are shown in Fig. 11. Measurements have been performed in the same way as explained for the multiple resistive feedback LNA. Compared to Fig. 9, gain is much lower, which is expected due to the single-stage nature of the active feedback amplifier. At the same time, the input-referred 1 dB compression point is improved notably. A more complete discussion of amplifier characteristics is presented in [14].

Pseudo-differential LNA
The core of the half-circuit shown in Fig. 12 (left) is the cascode amplifier with reactive shunt feedback on the left-hand side. As for the single-ended amplifiers, the bias current of this arrangement should be selected based on noise and linearity considerations. While a more detailed analysis of this topology, as presented for the inductively degenerated cascode amplifier with capacitive shunt feedback in [16], might be desirable at this point, limited space for this section does not permit a lengthy discussion. Instead, we refer to the amplifier of [17], from which the topology of Fig. 12 (left) has been derived.
A pseudo-differential amplifier has been implemented to support the development of the basic M-sequence system by adding the capability to use differential circuitry, especially differential antennas. Many of its characteristics are inherited from the topology of [17]. However, the emphasis with respect to certain design parameters has been shifted. The most notable aspect is the fact that the input matching network used in [17] could be omitted from the design. This modification was enabled by improved input-output isolation due to altered feedback tapping points. Some results of the manufactured chip, a photo of which is contained in Fig. 12 (right), are summarized in Fig. 13. Together with additional topological aspects, they are discussed in more detail in [18]. Measurements to gather those results have been performed on-wafer at the PM 8 probe station using ground-signal-signal-ground (GSSG) probes. Similar to the single-ended case, scattering parameters could be determined directly by a calibrated network analyzer. By contrast, only single-ended equipment has been available for noise figure measurement, and it is left to the next subsection to discuss a method applicable to (pseudo-)differential amplifiers.

Differential noise figure measurement
Due to space limitations, only a short introduction to this topic will be presented here. While there are alternative methods as presented in [23] and [24], for example, the focus will be on a new powerful method proposed in [20]. A convenient definition for the noise figure of a differential (or multiport) device with respect to one of its ports has been given by Randa [22] and is reprinted in (11) with slight modifications. The noise figure in (11) is parameterized by the matrix $\Gamma_k$ of reflection coefficients seen by the DUT into the ports of connected components and by the noise correlation matrix $C_k$ of incident noise waves injected by an external source. In (11), $I$ is the identity matrix, the dagger indicates the Hermitian conjugate, and $S_k$ and $C_{S,a}$ are the scattering matrix and the noise correlation matrix of emergent waves contributed by the DUT, respectively. To apply this definition, $C_{S,a}$ has to be determined first. Because only single-ended measurement equipment is currently available, the differential device has to be embedded into a network of passive components which provide the differential excitation. This is demonstrated in Fig. 14. The noise correlation matrix of the DUT then has to be de-embedded from the results measured for the component chain. For this purpose, the noise distribution matrix defined in [21] is a convenient starting point. Multiplied by Boltzmann's constant $k$ and the physical temperature $T$, it is the correlation matrix of emergent noise waves caused by a passive component, which accounts for all the noise generated within the device. It can be related to the scattering matrix as shown by (12). A short and intuitive proof of this relation is contained in [25]. Making use of single-ended noise measurement equipment, the characteristic noise temperature $\hat{T}_{chain}$, which characterizes noise from all the elements of the whole component chain, can also be determined.
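The relation between the noise distribution matrix and the scattering matrix can be illustrated numerically. The sketch below assumes the commonly used form for a passive multiport in thermal equilibrium, $C = kT\,(I - S S^{\dagger})$, which may differ in notation from (12). The function name and the example attenuator are illustrative only and are not taken from the measurement set-up.

```python
import numpy as np

K_BOLTZMANN = 1.380649e-23  # Boltzmann's constant in J/K


def passive_noise_correlation(s_matrix, temperature_kelvin):
    """Noise-wave correlation matrix (per Hz) of a passive multiport.

    Assumed relation: C = k * T * (I - S S^H), i.e. kT times the
    noise distribution matrix of the component.
    """
    s = np.asarray(s_matrix, dtype=complex)
    noise_distribution = np.eye(s.shape[0]) - s @ s.conj().T
    return K_BOLTZMANN * temperature_kelvin * noise_distribution


# Hypothetical component: ideal matched 3 dB attenuator at 290 K.
t = 1.0 / np.sqrt(2.0)
s_attenuator = np.array([[0.0, t], [t, 0.0]])
c_attenuator = passive_noise_correlation(s_attenuator, 290.0)

# Normalized to kT, each port emits half of the thermal noise power,
# and the two emergent noise waves are uncorrelated.
print(c_attenuator.real / (K_BOLTZMANN * 290.0))
```

A lossless component ($S S^{\dagger} = I$) yields a zero matrix under this relation, which matches the expectation that it generates no noise of its own.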
Thus, we are only left with the problem of describing signal transfer via the component chain in order to de-embed $C_{S,a}$. In [20], an approach based on the connection scattering matrix $W$ has been presented for this purpose, which supersedes the method of [19] because it is applicable without simplifying assumptions. The connection scattering matrix was introduced long ago, and its use for computer-aided circuit analysis has been discussed in [26], for example. For all ports in a component network, it relates the incident to the impressed waves. To enable such a matrix representation, incident, reflected, and impressed waves have to be collected in the wave vectors $a$, $b$, and $b_S$, respectively, which should be sorted component by component. For convenience, a component index is assigned to the vector entries, and it is assumed that elements corresponding to the DUT (index $k$) are placed at the bottom of each wave vector. An element is thus identified by two indices $i, j$ representing the component and the respective port number. With this convention, all complex wave amplitudes can be related by the set of linear equations (13), where $S$ is a block-diagonal matrix assembled from the individual component S-parameter matrices. The connections between the single components impose additional constraints on the wave amplitudes. To account for them, the connection matrix $\Gamma$ is used as in (14). Most often, a common real reference impedance is applied for all components. Then, all entries of $\Gamma$ are zero except those which refer to connected ports and, thus, are one. In this case, $\Gamma$ is a permutation matrix with $\Gamma^{-1} = \Gamma^{T}$. From (13) and (14), the incident wave vector $a$ can be eliminated to get (15). Noise generated by the source does not contribute to the characteristic noise temperature of the component chain. The respective entries of $b_S$ can thus be set to zero.
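The connection scattering matrix formalism can be sketched for a small component chain. The example assumes the textbook relations $b = S a + b_S$ and $b = \Gamma a$, so that $W a = b_S$ with $W = \Gamma - S$; the exact form of (13) to (15) in [20] may differ. The three components (matched source, attenuator, matched load) and all names are hypothetical.

```python
import numpy as np


def block_diag(*blocks):
    """Assemble square blocks into one block-diagonal matrix."""
    size = sum(block.shape[0] for block in blocks)
    out = np.zeros((size, size), dtype=complex)
    offset = 0
    for block in blocks:
        n = block.shape[0]
        out[offset:offset + n, offset:offset + n] = block
        offset += n
    return out


def connection_matrix(n_ports, connected_pairs):
    """Permutation matrix with ones at the positions of connected ports."""
    gamma = np.zeros((n_ports, n_ports))
    for i, j in connected_pairs:
        gamma[i, j] = gamma[j, i] = 1.0
    return gamma


# Global port order: 0 source, 1 attenuator in, 2 attenuator out, 3 load.
t = 0.5  # assumed wave transmission of the attenuator
S = block_diag(np.array([[0.0]]),
               np.array([[0.0, t], [t, 0.0]]),
               np.array([[0.0]]))
Gamma = connection_matrix(4, [(0, 1), (2, 3)])

W = Gamma - S                           # connection scattering matrix
b_s = np.array([1.0, 0.0, 0.0, 0.0])    # only the source impresses a wave

a = np.linalg.solve(W, b_s)             # incident waves at all ports
b = Gamma @ a                           # emergent waves at all ports
print(a.real, b.real)
```

In this sketch, the wave incident on the load equals the attenuator transmission `t`, as expected for a fully matched chain.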
Furthermore, in the case of $S_1$ containing zero entries (footnote 3), the corresponding rows of $W\Gamma^{-1}$ should be deleted to avoid rank-deficient matrix problems in some of the computations. For convenience, we will refer to the matrix obtained from this operation as $V$. After additional matrix partitioning, which is required later, (15) then becomes (16). In (16), submatrix $V_{am}$ will always be zero, because there is no direct connection from the input to the DUT. Hence, we are left with (17) and (18) after some algebra (footnote 4). In (20), the noise correlation matrices of the single passive components, given by the product of $kT$ and (12), are composed into the block-diagonal matrix $C_{S,p}$ (footnote 5). $C_m$ accounts for noise from all components of the chain related back to the input. For the set-up of Fig. 14, $C_m$ is a $1 \times 1$ matrix associated with the characteristic noise temperature as shown by (21). For using (11) in noise figure computations, its parameter matrices $\Gamma_k$ and $C_k$ still need to be determined. $\Gamma_k$ contains reflection coefficients, which relate waves injected from the DUT to those reflected back from the embedding network. It follows from inspection that $\Gamma_k$ is a submatrix of $W^{-1}$. Focusing on $C_k$, a simple argument leads to some confidence that (22) is a reasonable choice: Assume that the power splitter at the input of Fig. 14 only excites the differential mode. In this case, noise at the input of the DUT is completely correlated. Also, there should be a noise power of $kT$ available at the input of the DUT in differential mode to stay comparable to the standard noise figure definition. Then, due to the properties of mixed-mode transformation, (22) is the evident solution. This is discussed in more detail in [20].
Footnote 3: With respect to the set-up of Fig. 14
Footnote 4: As the right division function of a math program can be used to compute $Q$, there is no need for an explicit inversion, and a minimum norm solution is obtained for a non-square system.
Footnote 5: Note that noise contributions of the output loads, according to [22], should not be considered for NF computation.
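The argument for the choice of $C_k$ can be checked with a few lines of code. The sketch assumes the usual mixed-mode transformation for wave quantities, $a_d = (a_1 - a_2)/\sqrt{2}$ and $a_c = (a_1 + a_2)/\sqrt{2}$, and a candidate matrix $C_k = \tfrac{kT}{2}\begin{pmatrix}1 & -1\\ -1 & 1\end{pmatrix}$. This candidate is an assumption derived from the reasoning above; it is not copied from (22).

```python
import numpy as np

K_BOLTZMANN = 1.380649e-23  # J/K
T0 = 290.0                  # assumed reference temperature in K

# Mixed-mode transformation: [a_d, a_c]^T = M [a_1, a_2]^T
M = np.array([[1.0, -1.0],
              [1.0, 1.0]]) / np.sqrt(2.0)


def to_mixed_mode(c_single_ended):
    """Transform a single-ended noise correlation matrix to mixed-mode form."""
    return M @ c_single_ended @ M.conj().T


# Candidate input noise correlation matrix: fully (anti-)correlated noise
# at the two single-ended inputs, i.e. pure differential excitation.
c_k = 0.5 * K_BOLTZMANN * T0 * np.array([[1.0, -1.0],
                                         [-1.0, 1.0]])

c_mixed = to_mixed_mode(c_k)

# Normalized to kT: all noise power sits in the differential mode.
print(c_mixed / (K_BOLTZMANN * T0))
```

Under these assumptions, the differential-mode entry equals $kT$ and the common-mode entry vanishes, which is the property required in the text.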
Thus, all inputs required for (11) are determined, and the noise figure with respect to a certain output port can be calculated. Instead of a physical port, a logical port can also be considered. For this purpose, the matrices in the numerator and the denominator of (11) have to be subjected to an appropriate transformation. In view of Fig. 14, mixed-mode transformation provides the noise power spectral densities in the differential mode at the selected output of the DUT for both cases: noise generated by the DUT and the input sources, as well as noise generated by the input sources alone. Their ratio finally determines the differential noise figure of the device. This approach has been applied to the pseudo-differential amplifier shown in Fig. 15. The result is contained in Fig. 16 together with the noise figure measured from one signal branch, which is given for comparison. In the measurements, the losses of the probe heads have been taken into account appropriately.

Solutions for the feedback-sampling approach
The introduction of the feedback-sampling concept by system design spawned the requirement of signal subtraction at the input of the receive path. Hence, the demand for new components equipped with two inputs arose: one for the RF signal taken from the antenna, and another one for a digital prediction signal provided by the signal processing via a digital-to-analog converter (DAC). In theory, subtraction results in an output signal of highly reduced voltage swing to which the analog-to-digital converter (ADC) used for signal acquisition will be exposed. To confirm this theory in practice, two versions of an input subtractor have been implemented and will be presented in the following subsections.

Pseudo-differential feedback-sampling amplifier
Similar to the design of amplifiers for the basic system, the development of the feedback-sampling amplifier has been guided by the assumption that the receive signal in the RF path is rather weak and sufficient amplification has to be provided, while the least amount of excess noise should be added. Hence, the amplifier of Fig. 12 has been reviewed and was deemed suitable as the RF input stage of the new topology. In Fig. 15, it can be identified in the dashed box on the left-hand side. Some adjustments, especially with respect to the values of components in the feedback network, had to be made, though, as the application required a shift in the covered frequency range. At that time, the prediction signal had to be provided by a current-steering DAC and can be assumed to be of rather large signal swing. Hence, no amplification is provided for this signal. Promoted by the nature of the prediction signal, the required signal subtraction is performed in the current domain. In Fig. 15, two current mirrors inject their output signals into a common output node for this purpose. While this implies signal addition instead of signal subtraction at first glance, signal addition can be turned into signal subtraction by simple sign inversion, which is enabled either by exploiting the properties of the pseudo-differential amplifier structure itself or by sign selection of the prediction signal in the digital domain. The special current-mirror arrangement of Fig. 15 has been chosen for balancing the maximum output powers, which is important to ensure that the signal from the DAC input can cancel the signal from the RF input. The linearity of both signal paths, the transconductance from the DAC input to the common output, and the maximum output currents of the DAC have to be harmonized to account for this requirement. This topology has been characterized in detail. Results for the RF signal path are presented in Fig. 16.
As explained for the pseudo-differential amplifier in section 3.2.3, the PM 8 probe station equipped with GSSG probes can be used to examine an individual differential signal path. In order to determine the mixed-mode parameters given in Fig. 16, the device has been exposed to true-mode excitation provided by the network analyzer, while calibration data have been applied to compensate for losses caused by the test set-up. Noise characterization can be performed by the method presented in section 3.3 and has been discussed in [19], [20]. A remarkable feature of the results of this measurement is the fact that the differential noise figure $NF_{diff}$ does not coincide with the noise figure $NF_{SE}$ measured from one signal path (footnote 6).
Especially at higher frequencies, a large deviation occurs. In [20], this is explained by crosstalk caused by parasitics. So, the use of differential de-embedding schemes is recommended instead of a single-ended noise figure measurement from one signal branch. However, in view of the aim to assess the capabilities of the feedback-sampling approach, the signal subtraction itself is most interesting. In [27], we presented different test set-ups for verification purposes. Fig. 17 shows four representative results which confirm the ability to cancel the RF signal by an appropriate prediction signal. Those measurements were obtained from a test set-up incorporating the PM 8 probe station with GSSG probes, in which two signal generators (footnote 7) synchronized by a frequency standard provided the input signals to both inputs at appropriate power levels via two hybrid couplers. The differential output signal of the DUT was recombined by a third hybrid coupler and displayed by a signal analyzer (footnote 8). In Fig. 17, no loss compensation is applied and results are clipped to a 100 kHz span. Two cases can be distinguished: First, the digital prediction signal has been switched off (DACoff) and only the RF signal has been present at the inputs. Then, the prediction signal has also been applied (DACon), and a notable reduction in output signal power can be observed for all frequencies.
Footnote 6: In Fig. 16, NF curve progressions start at 1 GHz because this is the lower corner frequency of the hybrid couplers used for measurement.

Subtractor with Low Impedance antenna interface
The feedback-sampling amplifier of the preceding subsection is expected to perform well as long as the assumption of (reasonably) small input signals is justified. A key requirement in the feedback-sampling concept is linearity preceding the signal subtraction in order not to distort the zero crossings which are sampled by the analog-to-digital converter. However, as soon as array operation is considered, antenna cross-talk is likely to violate this assumption. In addition, a dense antenna array requires the antennas to have small outer dimensions. This can be achieved by dielectrically scaling the antennas, which, in turn, leads to a low (7 Ω) feeding point impedance. The latter has to be interfaced by the subtraction circuit. The topology shown in Fig. 18 is a first approach towards an analog subtractor which provides appropriate single-ended inputs to interface with both a dielectrically scaled antenna and the DAC. In this implementation, noise figure is traded against linearity, as input signals close to 0 dBm might occur. For its implementation, component parameters have been determined by a semi-automated procedure, in which the input stage, a common-base configuration, was optimized with respect to input power matching, while an upper bound for $NF_{min}$ was respected and noise matching was clearly observed. As in the case of the feedback-sampling amplifier according to Fig. 15, signal subtraction is performed in the current domain using a common output node. A printed circuit board has been designed to enable joint performance evaluation of this amplifier and the 7 Ω antenna. Due to the low feeding point impedance, separate characterization is less useful. To avoid problems involved in interconnecting devices with 7 Ω reference impedance, the amplifier should be attached directly to the antenna, which is supported by the board.
Footnote 7: Rohde & Schwarz SMJ100A
Footnote 8: Rohde & Schwarz FSV
Thus, both evaluation and refinement of this circuit will have to be performed in close collaboration with our partners from the ultraMedis project.

Introduction
The circuits introduced in this section serve the M-sequence topology. They have been implemented in a cost-efficient 0.25 µm silicon-germanium BiCMOS technology, which opens up new fields of ultra-wideband radar applications. In the following subsections, the design of different hardware blocks for the ultra-wideband radar front-end is presented. First, a multi-purpose M-sequence generator is described, which acts as a pulse compression modulator and includes an up-conversion mixer. A highly efficient distributed power amplifier has been implemented utilizing a novel cascode power matching approach to achieve superior output power performance. Additionally, a fully differential broadband amplifier using cascaded emitter followers has been designed that offers variable gain control and excellent broadband performance.

M-sequence generator
The well-known, very broadband spectrum of M-sequences is widely used for testing the correct functionality of broadband integrated circuits, such as amplifiers, multiplexers, and transceivers. The race toward higher data rates and amplifiers with broader bandwidth often outpaces commercially available test equipment and necessitates dedicated sources to test these circuits. The measurement equipment vendors cannot supply data sources as fast as the technology evolves. The application targeted in this chapter is the use of M-sequences for pulse compression in ultra-wideband radar systems. For this application, it is important that the generator consumes little energy and generates a sequence of appropriate length (see (2)). Early high-speed PRBS generators for high data rates have been implemented in III/V HBT technologies [28]. Moreover, a 110 Gb/s PRBS generator has been published in [29] using InP HBT technology with a transit frequency ($f_T$) of more than 300 GHz. Recently, several PRBS generator circuits have been published in SiGe bipolar technology for test purposes in fiber-optic communications. In [30], a 100 Gb/s $2^7 - 1$ PRBS generator has been implemented in a 200 GHz $f_T$ SiGe bipolar technology. As in the 80 Gb/s $2^{31} - 1$ pseudo-random binary sequence generator introduced in [31], the output of the shift register has been multiplexed to achieve a higher maximum data rate. However, these circuits have a power consumption of 1.9-9.8 W and utilize cost-intensive high-end processes. A 4 × 23 Gb/s $2^7 - 1$ PRBS generator with a power consumption of 60 mW per lane has been published in [32] utilizing a 150 GHz $f_T$ SiGe BiCMOS technology. A $2^7 - 1$ multiplexed PRBS generator in 0.13 µm bulk CMOS exhibits a 24 Gb/s output data rate [33]. In the following section, the circuit implementation of the M-sequence generator is presented together with measurement results.

Upconverted M-sequence generator
A simple way to generate M-sequences is to utilize a digital linear feedback shift register (LFSR), as depicted in Fig. 19. This device generates a binary pseudo-random code of length $2^n - 1$, where $n$ is the number of stages in the shift register. Feedback is provided by adding the output of the shift register, modulo two, to the output of one of the previous stages. The actual sequence obtained depends on both the feedback connections and the initial loading of the register.
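The LFSR principle can be sketched in software. The example below assumes a 9-stage register with feedback taken from stages 9 and 5, which is a common maximal-length choice; the tap positions and the function name are illustrative and are not taken from Fig. 19.

```python
def lfsr_sequence(n_stages, taps, seed=None):
    """Generate one period (2**n_stages - 1 bits) of an LFSR output.

    taps: 1-based stage numbers whose outputs are added modulo two
          to form the feedback bit (assumed tap convention).
    seed: initial register loading; must not be all zeros.
    """
    state = list(seed) if seed is not None else [1] * n_stages
    if len(state) != n_stages or not any(state):
        raise ValueError("seed must have n_stages bits and not be all zero")

    output = []
    for _ in range(2 ** n_stages - 1):
        output.append(state[-1])          # output of the last stage
        feedback = 0
        for tap in taps:
            feedback ^= state[tap - 1]    # modulo-2 addition
        state = [feedback] + state[:-1]   # shift by one stage
    return output


sequence = lfsr_sequence(9, taps=(9, 5))
print(len(sequence), sum(sequence))
```

For a maximal-length sequence, one period contains every non-zero $n$-bit pattern exactly once, so the number of ones exceeds the number of zeros by one. A different seed only changes the starting phase of the sequence.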
The proposed architecture depicted in Fig. 20 consists of serially connected shift registers implementing the characteristic polynomial, an additional XOR gate acting as a modulo-2 adder to yield the delayed sequence, and one multiplexer. The selected feedback in the proposed architecture makes it possible to generate two M-sequences with a mutual shift of half the word length. Those are multiplexed to yield the same M-sequence at twice the data rate. The input for the multiplexer is taken between the two latches of the fifth flip-flop highlighted in Fig. 20. This leads to a phase shift of half the pulse width in order to achieve the maximum voltage swing at the input of the multiplexer. Thus, the proposed architecture makes it possible to boost the circuit performance at the cost of an additional adder and a multiplexer. The architecture is extended to provide the possibility of upconversion for the generated M-sequence. This has been facilitated by implementing a mixer core at the output of the multiplexed LFSR. The mixer performs a BPSK modulation of the 9th-order M-sequence signal generated by the multiplexed shift register. In contrast to [34], the mixer was implemented as an XOR gate instead of a conventional Gilbert cell. The actual circuit is nearly identical, but the XOR operates in the limiting region, whereas the Gilbert cell operates in the small-signal regime. The limiting behavior simplifies the design and relaxes the requirements of the mixer, and it results in lower power consumption. No emitter degeneration has to be implemented to increase linearity for large-signal inputs. The XOR gate is driven by an LO buffer that can be digitally controlled to allow the generation of baseband M-sequence signals without up-conversion. An additional output buffer with a resistively matched output has been included in order to control the output voltage swing over a wide range.
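The multiplexing principle can be checked numerically. The sketch assumes a 9th-order M-sequence with feedback taps at stages 9 and 5 and interprets "half the word length" as a shift of $(2^9 - 1 + 1)/2 = 256$ bits; both choices are assumptions made for illustration and are not taken from Fig. 20.

```python
def m_sequence(n_stages=9, taps=(9, 5)):
    """One period of an M-sequence from a simple LFSR (assumed taps)."""
    state = [1] * n_stages
    bits = []
    for _ in range(2 ** n_stages - 1):
        bits.append(state[-1])
        feedback = 0
        for tap in taps:
            feedback ^= state[tap - 1]
        state = [feedback] + state[:-1]
    return bits


def multiplex_half_shift(seq):
    """Interleave a sequence with a copy shifted by half its period."""
    period = len(seq)
    half = (period + 1) // 2
    out = []
    for k in range(period):
        out.append(seq[k])                      # first multiplexer input
        out.append(seq[(k + half) % period])    # second, shifted input
    return out


def is_cyclic_shift(candidate, reference):
    """True if candidate equals reference up to a cyclic rotation."""
    doubled = reference + reference
    n = len(candidate)
    return any(doubled[r:r + n] == candidate for r in range(len(reference)))


slow = m_sequence()
fast = multiplex_half_shift(slow)   # two periods at twice the bit rate
print(is_cyclic_shift(fast[:len(slow)], slow))
```

The check relies on the property that decimating a binary M-sequence by two reproduces the same sequence with a different phase, so the interleaved stream is again the original M-sequence.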

Once the functional simulations have been completed, each individual block is designed on the transistor level. As the multiplexing architecture has been chosen, the flip-flops only have to work at half the operation speed. Standard CML flip-flops have been designed consisting of two latches, each comprising two differential pairs. A schematic diagram of a CML flip-flop is depicted in Fig. 21. The flip-flops used in the LFSR are designed to offer a differential output voltage of 2 × 300 mV. According to [31]

Simulated and measured results
The M-sequence generator has been simulated in the time domain to find the maximum data rate and to verify the correct function of the register. At clock frequencies higher than the maximum allowable clock frequency, the PRBS register does not work as expected and ... Fig. 22 using the techniques described above. It can be seen that the individual bits can be clearly distinguished from each other. A single bit has a slightly lower output voltage than a sequence of bits with the same value, which is caused by the limited output bandwidth. As mentioned before, the mixer and the output buffer both have a limiting character, which attenuates the flip-flop glitches. Thus, the waveform exhibits low ripple even during runs of equal bits, which indicates that the clock feedthrough is very low. However, the output waveform exhibits some deviations, which is quite common for circuits at this high data rate. This behavior may result from the slightly inductive behavior of an emitter follower in the signal chain. As long as the circuit is stable, this does not cause problems. The M-sequence generator chip is placed on a Rogers TMM10i ceramic substrate for wire bonding. In order to protect the circuit mechanically and keep the bond wires as short as possible, it is placed in a topside cavity and fixed utilizing an electrically and thermally conductive epoxy glue, as shown in Fig. 23. The thickness of the ceramic substrate has been chosen to be 381 µm, which is almost equal to the chip height of 370 µm plus the glue. Thus, the distance between the substrate edge and the bond pad can be reduced. The continuous ground plane below ensures good thermal conduction, and a 1.2 mm thick FR4 layer stabilizes the brittle ceramic substrate.
The correct function of the generator can be checked by calculating the normalized autocorrelation function (ACF) of one complete M-sequence. The calculation of the ACF has been implemented in MATLAB. As the correlation properties are of substantial interest for radar applications utilizing pulse-compressed waveforms, the PRBS signal is measured with an Agilent DSO 91204A oscilloscope and compared with the simulation results. The simulated and measured 10 Gb/s waveforms together with the normalized cross correlation function of the measured signal are presented in Fig. 24.
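The ACF check can be reproduced for an ideal sequence. The sketch below is written in Python rather than MATLAB and uses an ideal 9th-order M-sequence with assumed feedback taps at stages 9 and 5 in place of the measured oscilloscope data; all names are illustrative.

```python
import numpy as np


def m_sequence(n_stages=9, taps=(9, 5)):
    """One period of an M-sequence from a simple LFSR (assumed taps)."""
    state = [1] * n_stages
    bits = []
    for _ in range(2 ** n_stages - 1):
        bits.append(state[-1])
        feedback = 0
        for tap in taps:
            feedback ^= state[tap - 1]
        state = [feedback] + state[:-1]
    return bits


def normalized_periodic_acf(bits):
    """Normalized periodic autocorrelation of a binary sequence."""
    symbols = 1.0 - 2.0 * np.asarray(bits, dtype=float)  # 0 -> +1, 1 -> -1
    n = len(symbols)
    return np.array([np.dot(symbols, np.roll(symbols, lag)) / n
                     for lag in range(n)])


acf = normalized_periodic_acf(m_sequence())
print(acf[0], acf[1:].min(), acf[1:].max())
```

For an ideal M-sequence of length $N$, the normalized periodic ACF equals one at zero lag and $-1/N$ at every other lag. Deviations from this two-valued shape in measured data indicate bit errors or waveform distortion.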

Distributed power amplifier
The transmitted random sequence is subjected to high losses, especially when transmitted through human body cells. Thus, a power amplifier has to be placed directly before the antenna to increase the signal power. Distributed amplifiers (DAs) are appealing candidates for UWB systems due to their inherently large bandwidth. The two major challenges in designing distributed power amplifiers are maintaining high linearity over the entire bandwidth, since narrowband linearization techniques cannot be utilized, and achieving high output power and efficiency. In order to increase the performance of the HBT distributed power amplifier, which is limited by the characteristics of the active cells used [35], alternative structures are investigated. The cascode cell is an appealing circuit due to its higher output impedance, higher breakdown voltage, and reduced Miller effect. Moreover, loading the two transistors with the impedance required for optimum power leads to an output power twice as high as that of a single transistor. However, the conventional cascode configuration does not meet these conditions, since the low input impedance of the common-base transistor ($T_{cb}$) restricts the output voltage excursion of the common-emitter transistor ($T_{ce}$). Therefore, $T_{ce}$ does not see its optimum power load impedance. In addition, the power performance of the cascode cell becomes one of the most important challenges in obtaining maximum output power over the required bandwidth. For power optimization, a series capacitor $C_a$ is inserted at the base of $T_{cb}$ to avoid its early power saturation compared to $T_{ce}$. A small-signal model of the modified cascode gain cell is depicted in Fig. 25, from which the input impedance of the common-base transistor can be calculated. In order to achieve higher gain and greater bandwidth, an additional inductor is added between the collector of $T_{ce}$ and the emitter of $T_{cb}$.
The influence of a 1.5 nH inductor and various capacitances on the voltage gain of the cascode cell is demonstrated in Fig. 26. The small-signal schematic diagram of the modified cascode cell with inductive peaking is presented in Fig. 27a. The output resistance of the modified cascode circuit can be written as The resonant frequency shows good agreement with the theoretical considerations set out in (27).
Another effect is that the output impedance of the cascode cell increases significantly from 2 to 15 GHz under the influence of the 1.5 nH inductor. The initial values for $L_a$ and $C_a$ are then optimized under large-signal conditions using nonlinear simulations of the inductively peaked cascode circuit close to the 1 dB compression point in order to obtain the maximum output power and efficiency. Accordingly, one single cell of a simple common-emitter stage is used to synthesize the required ratio between the values of the inductor and the capacitor. The test circuit is terminated with a 200 Ω resistor. The goal is to have equal deflections of the load lines for both the common-emitter and the common-base transistor. These results demonstrate that the proposed cascode configuration can obtain twice the output voltage swing of a single common-emitter transistor at the same collector current, so that twice the output power can be achieved.
A demonstrator chip has been implemented utilizing the methodology described previously (Fig. 28). The tapered collector line has been realized using staggered inductors in order to achieve a coherent addition of the collector currents and a flat gain over the entire bandwidth. Biasing is implemented using three-transistor current mirrors with a ratio of 32:1 and a low-dropout (LDO) voltage reference driven by a band-gap voltage source. A chip microphotograph of the complete circuit is shown in Fig. 29. The transistors are biased through the collector line by means of an external bias-tee. The bias point was selected at $V_{CC} = 5$ V and $V_{DD} = 2.6$ V. A power and ground grid facilitates a low-impedance connection and, due to the small distance between the congruent metal grids, forms a large capacitor. The chip size of the amplifier circuit is 2.1 mm². The distributed power amplifier chip was tested via on-wafer probing. The measurements of the circuit were carried out using an Agilent N5242A PNA-X vector network analyzer. Fig. 30 shows the simulated and measured small-signal gain $S_{21}$ and input return loss $S_{11}$. The traveling-wave power amplifier exhibits a measured gain of 11 dB with a gain ripple of ±1 dB up to 12 GHz and a 3 dB bandwidth of 13 GHz. The simulated and measured output return loss $S_{22}$ and the reverse isolation $S_{12}$ are illustrated in Fig. 31. Both the input and output return loss are below -12 dB over the entire frequency range. The measured reverse isolation $S_{12}$ remains below -35 dB. The circuit is unconditionally stable, which has also been verified for large RF input signals. Fig. 32 shows the output power $P_{out}$ and power-added efficiency (PAE) at a center frequency of 7 GHz. The 1 dB compression point is at 17.45 dBm with an associated power-added efficiency of 13.9%. The saturated output power $P_{sat}$ is 20 dBm and the maximum power-added efficiency is 22.1%.
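The reported figures can be related to each other through the standard definition of power-added efficiency, $PAE = (P_{out} - P_{in}) / P_{DC}$. The sketch below assumes that the small-signal gain at 7 GHz equals the reported 11 dB, so that the gain at the 1 dB compression point is 10 dB. This is an assumption, and the resulting DC power is only a rough plausibility estimate, not a value stated in the text.

```python
def dbm_to_mw(power_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10.0 ** (power_dbm / 10.0)


def power_added_efficiency(p_out_dbm, p_in_dbm, p_dc_mw):
    """PAE = (P_out - P_in) / P_DC, with RF powers given in dBm."""
    return (dbm_to_mw(p_out_dbm) - dbm_to_mw(p_in_dbm)) / p_dc_mw


def implied_dc_power_mw(p_out_dbm, gain_db, pae):
    """DC power consistent with a given output power, gain and PAE."""
    p_out_mw = dbm_to_mw(p_out_dbm)
    p_in_mw = dbm_to_mw(p_out_dbm - gain_db)
    return (p_out_mw - p_in_mw) / pae


# Reported values at the 1 dB compression point (7 GHz).
P1DB_DBM = 17.45
PAE_AT_P1DB = 0.139
ASSUMED_COMPRESSED_GAIN_DB = 11.0 - 1.0   # assumption, see lead-in

p_dc_mw = implied_dc_power_mw(P1DB_DBM, ASSUMED_COMPRESSED_GAIN_DB, PAE_AT_P1DB)
print(f"implied DC power: {p_dc_mw:.0f} mW")
```

Under these assumptions, the implied DC power is on the order of 360 mW. The same helper functions can be applied to the saturated operating point once the corresponding input power is known.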

Differential broad band amplifier
Broadband variable gain amplifiers are key components for ultra-wideband radar applications and important building blocks to increase the dynamic range. Especially for M-sequence based radar systems without upconversion, the lower frequency range, which contains most of the signal energy [36], has to be considered. Biomedical and ground penetrating radar systems necessitate a lower frequency boundary of less than 1 GHz [37], [38].
Moreover, the broadband variable gain amplifier (VGA) should be fully differential. Great care has to be taken to avoid distortion of the signal shape through gain ripple and group delay variation. In this section, the analysis, design and measurement results of a fully differential broadband VGA are presented. After some considerations about mismatching in broadband amplifiers, the frequency behavior of cascaded emitter followers is investigated, and the implementation of a variable gain control is explained. Finally, the implementation of the broadband amplifier is presented, introducing the circuit architecture and presenting measurement results. The amplifier is fully differential and based on a cascode configuration as depicted in Fig. 33. This is useful for high-frequency circuit design, because this multi-device configuration has small high-frequency feedback, achieved by the negligible Miller effect, and a large bandwidth. Driving the cascode stage with cascaded emitter-followers leads to an enhancement of bandwidth and provides dc level shifting [39].
The voltage gain of the cascaded emitter-followers has a frequency dependence that is similar to that of the transfer function of an RLC series resonance circuit [40]. This can be used to provide gain peaking at the desired frequency. The transfer function depends on the transistor parameters, the biasing current, the resistors, and the load. The main problem when using emitter-followers to drive cascode stages is that the circuit might become unstable. This will be the case if the negative input resistance of the second emitter-follower stage becomes larger than the positive output resistance of the first stage at a certain frequency, which is shown in Fig. 35.
The core of the broadband amplifier is a signal summing VGA as illustrated in Fig. 33, where the gain is controlled by applying an analog dc voltage at V_C. The amplifier gain can be set from 0 to maximum gain, whereby it behaves like a cascode differential stage when the control voltage is set to 0 and all current flows through the load resistors R_L.
Furthermore, capacitive emitter degeneration is used to attain additional gain at high frequencies for a higher cut-off frequency. Tuning the values of R_e and C_e introduces a trade-off between high gain, bandwidth and stability, because C_e influences the capacitive load of the cascaded emitter followers. Inductive peaking is carefully applied using small inductors L_C in the collector branches in order to avoid high group delay [41]. In order to achieve a high output swing, high currents in the differential amplifier are necessary.
The circuit is implemented in the 0.25 µm IHP SGB25V value technology. A chip photograph of the broadband variable gain amplifier is depicted in Fig. 34. The circuit elements composing the amplifier core have been arranged symmetrically to maximize the even mode suppression. The mixed-mode S-parameters are measured on-wafer using 150 µm GSGSG probes. Fig. 36a illustrates the differential simulated and measured gain Sdd21 as well as the input and output return loss at 100 Ω differential source and load impedance. The measured differential gain is 11.5 dB with a gain flatness of ±1.5 dB. The 3 dB cut-off frequency is 30 GHz, which results in a gain-bandwidth product (GBP) of 113 GHz, which is 1.5 times the f_T of the transistor. The corresponding measured and simulated group delay is shown in Fig. 36b. The measured group delay variation is 35 ps, which is higher than that in the simulation and also induced by the stronger resonance behavior. As depicted in Fig. 37, the amplifier gain can be adjusted between 0 and 11.5 dB. The large signal behavior is measured on-wafer. An output 1 dB compression point of 12 dBm has been measured up to 20 GHz.
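The quoted gain-bandwidth product can be checked with a short calculation. The sketch below assumes that the GBP is taken as the linear voltage gain multiplied by the 3 dB cut-off frequency; the function name is illustrative and not part of the original work.

```python
def gain_bandwidth_product(gain_db: float, bandwidth_hz: float) -> float:
    """Return linear voltage gain times bandwidth (assumed GBP definition)."""
    linear_gain = 10 ** (gain_db / 20.0)  # convert dB to a voltage ratio
    return linear_gain * bandwidth_hz


# Measured values quoted in the text: 11.5 dB gain, 30 GHz cut-off frequency.
gbp = gain_bandwidth_product(11.5, 30e9)
# The text states that the GBP is 1.5 times the transit frequency f_T.
implied_ft = gbp / 1.5
print(f"GBP = {gbp / 1e9:.0f} GHz, implied f_T = {implied_ft / 1e9:.0f} GHz")
```

With these inputs the result is close to the 113 GHz given above, which implies a transit frequency of roughly 75 GHz.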

Introduction of data capturing device using feedback principle
A straightforward data capture and digitizing can be directly performed by a conventional analog-to-digital converter (ADC). There are a number of limitations which arise using this method. The first and most crucial one is an inverse relationship between accuracy and speed of the conversion. In terms of the ADC, it is the inverse relationship between resolution and bandwidth. It is impossible to realize a high-speed ADC with the resolution which fulfils the sensor specification in modern technologies.
To overcome this limitation, a more complicated method of data capture based on a "stroboscopic feedback loop" can be used. This method utilizes a feedback loop to relax the accuracy requirements of the ADC (see [2] and chapter 6.2.3). The digital output of the data capturing device is represented by two summands: the value of the first summand is measured by the ADC; the value of the second summand is calculated based on its previous state and on the first one. The ratio between predicted and measured summands, i.e. between the resolution of the ADC and the DAC, can be calculated from the conversion efficiency of both converters [42].
The block diagram of the data capturing device with feedback is depicted in Fig. 38. It consists of 3 logical parts, highlighted in colors in Fig. 38: Signal Processing, ADC and DAC, and LNA with subtraction amplifier. Although the subtraction amplifier belongs to the data capturing block, it has been integrated into LNA and moved to the receiver part of the sensor.
The data capturing device works as follows:
1. A capturing block digitizes the difference (residue) between the received and predicted values. This function is performed by a high-speed low-resolution analog-to-digital converter.
2. A digital predictor evaluates the data from the ADC and makes a prognosis about the value to be expected next.
3. The predicted value is converted into an analog form with a high-speed DAC.
4. In the analog domain, the predicted value is subtracted from the received signal with a subtraction amplifier.
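The loop described above can be illustrated with a small behavioral simulation. This is only a sketch under simplifying assumptions: the signal is strictly periodic and noise-free, the ADC and DAC share the same LSB, and the predictor simply adds the measured residue to the stored value for that sample position. The function name and all parameters are hypothetical.

```python
import numpy as np


def feedback_capture(signal, periods, adc_bits=4, lsb=1.0):
    """Capture a periodic signal with a coarse ADC inside a prediction loop."""
    half_range = 2 ** (adc_bits - 1)        # 4 bits cover codes -8 ... +7
    prediction = np.zeros(len(signal))      # one DAC word per sample position
    for _ in range(periods):
        for k, received in enumerate(signal):
            # Subtraction amplifier: residue between received and predicted value.
            residue = received - prediction[k] * lsb
            # Low-resolution ADC: quantize and clip the residue.
            code = int(np.clip(np.round(residue / lsb), -half_range, half_range - 1))
            # Predictor (assumed rule): new DAC word = old word + measured residue.
            prediction[k] += code
    return prediction * lsb


# A test signal spanning +/-100 LSB, far more than the 4-bit ADC range.
signal = 100.0 * np.sin(2 * np.pi * np.arange(64) / 64)
captured = feedback_capture(signal, periods=20)
print(np.max(np.abs(captured - signal)))
```

After enough periods the stored values track the signal to within half an LSB, although the ADC itself only resolves 16 levels. With too few periods the loop has not yet settled, which is the price of the relaxed ADC requirement in this simplified model.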

High-speed analog-to-digital converter
The fastest type of A/D converter is the full flash ADC. A block diagram of a typical full flash converter is shown in Fig. 39. It consists of a reference network, a bank of comparators, correction and encoding logic and test buffers. The challenges of the implementation of the ADC are usually related to the analog part of the converter, namely to the reference network and to the bank of comparators. It is possible to implement a high-speed comparator in the selected technology which satisfies all requirements, but the reference network is a bottleneck of the converter.
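As a purely behavioral illustration of the flash principle, the sketch below models an ideal converter: equidistant reference voltages, one comparator per reference, and an encoder that counts the comparators whose output is high. The input range, the function name and the ones-counting encoder are assumptions for illustration and do not describe the implemented correction and encoding logic.

```python
import numpy as np


def flash_adc(v_in, n_bits=4, v_min=-1.0, v_max=1.0):
    """Ideal flash conversion of a single input voltage to an output code."""
    n_comparators = 2 ** n_bits - 1
    lsb = (v_max - v_min) / 2 ** n_bits
    # Reference network: equidistant tap voltages between v_min and v_max.
    references = v_min + lsb * np.arange(1, n_comparators + 1)
    # Comparator bank: each comparator decides "input above reference?".
    thermometer = v_in > references
    # Encoder: the number of ones in the thermometer code is the output code.
    return int(np.count_nonzero(thermometer))


print(flash_adc(-1.0), flash_adc(0.01), flash_adc(0.99))
```

The model also shows why the comparator count, and with it the capacitive load on the reference network, roughly doubles with every additional bit.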

Reference network
The task of the reference network is to provide equidistant reference voltages which will be further processed by the comparators.
There are two conventional implementations of the reference network. The first and simplest is a Kelvin divider or resistor ladder. It suffers from several drawbacks, such as DC bowing as well as clock and input feed-through [43]. Furthermore, it is not well suited for high-speed ADCs.
A second configuration is a differential one. It consists of two branches; each has a driver loaded with a chain of serially connected tap resistors. Both branches are equal, only outputs of the second branch are "inverted" or mirrored with respect to the middle point [43]. The main problem related to the differential network is its bandwidth, which often becomes a bottleneck of the system. The reference network has to drive a big parasitic capacitive load caused by the bank of comparators. In the full flash ADC, it is one of the main limitations, because the number of comparators is doubled when increasing the resolution by 1 bit.
The second problem of such a network is the unequal transfer characteristic of the output nodes [42].

Proposed bandwidth enhancement technique
Drawbacks of the conventional differential reference network are mainly due to its serial configuration; a change in one component will affect the others. This inherent property of the serial connection makes individual adjustments and compensations impossible. To overcome this limitation, a new configuration of the reference network is proposed. An idea is to build the resistor network in a segmented serial-parallel configuration and substitute one driver (emitter follower) with several drivers, connected in parallel. A full overview of the possible configurations is described in [44]. Among this variety one configuration should be highlighted, namely the configuration illustrated in Fig. 40 where each segment contains one tap resistor and one current source. The reference network is fully parallel, thus allowing the maximum speed to be achieved.
The main feature of the parallel network is the flexibility to choose component values. This freedom makes it possible to equalize the bandwidths of the individual segments, which leads to the optimal speed at a given power dissipation. Fig. 40 shows the case when all driver currents are equal. In practice, it is more useful not to keep the currents in all segments equal, but to equalize the bandwidths in each segment instead. The network, however, does not only offer advantages; it also has some drawbacks. The flexibility of adjusting different parameters leads to different geometries of the resistors. In the case of the conventional network, all resistors have the same value and the same geometry, so proper layout minimizes the mismatch between them. The proposed network cannot benefit from this feature.

Design of comparator
Signals from the reference network are led to a bank of n comparators. Comparators decide if the input is above or below the reference. For decreasing the probability of errors, a master-slave comparator with a preamplifier is used. An overall schematic diagram is shown in Fig. 41. The role of the preamplifier for the comparator is twofold: It works as a limiting amplifier, and it provides an additional amplification of the input signal. Another important function of the preamplifier is isolating the reference network from kick-back noise, produced by the master latch. In this particular example, the Cherry-Hooper amplifier with emitter follower feedback is used as preamplifier.
The master latch has an auxiliary current source Iaux. This current source prevents the cross-coupled differential pair from being completely switched off, thus keeping the base-emitter capacitance charged. The time to charge this capacitance is decreased, and as a result the overall speed of the latch is increased. Iaux has to be sufficiently small because it adds hysteresis, which decreases the sensitivity of the comparator. Setting the value of Iaux equal to 10 % of IEE2 is a good compromise between speed and sensitivity. In the slave latch, there is no auxiliary current source because the input signal of the slave latch is relatively large, and an auxiliary current source would not have as strong an influence as in the case of the master latch.

Experimental results
The ADC with the proposed parallel reference network was implemented in 0.25 µm SiGe BiCMOS technology. The chip micrograph of the ADC is depicted in Fig. 42.
Static measurements: For measuring static errors of the ADC, a low-frequency 50 MHz sine signal was applied to the input of the converter at 5 GS/s sample rate. The deviation of a transition from the mean value, the differential nonlinearity (DNL), was calculated for each step. The cumulative sum of the differential errors represents the integral nonlinearity (INL). The results are presented graphically in Fig. 43.
Dynamic measurements: The signal-to-noise and distortion ratio (SINAD) of the test circuit was measured over the frequency range up to 6 GHz at a constant sample rate of 15.01 GS/s. The small frequency offset of 10 MHz was chosen to accumulate quantization errors over the whole dynamic range. The measurement results are presented in Fig. 44, which shows the SINAD of the converter up to an input frequency of 6 GHz. The dashed line shows the level where the SINAD drops 3 dB below its value at low frequency. The frequency where the SINAD crosses the 3 dB line indicates the effective resolution bandwidth of the converter, which in this case is greater than 6 GHz.
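The relation between DNL and INL described for the static measurements can be sketched as follows. The example assumes that the code-transition levels have already been extracted from the measurement and that the mean step width is taken as one LSB; the function name and the sample data are hypothetical.

```python
import numpy as np


def dnl_inl(transitions):
    """Compute DNL and INL (in LSB) from measured code-transition levels."""
    transitions = np.asarray(transitions, dtype=float)
    widths = np.diff(transitions)      # width of each code step
    mean_width = widths.mean()         # mean step width, taken as 1 LSB
    dnl = widths / mean_width - 1.0    # deviation of each step from the mean
    inl = np.cumsum(dnl)               # cumulative sum of the differential errors
    return dnl, inl


# Hypothetical transition levels with one step too wide and one too narrow.
dnl, inl = dnl_inl([0.0, 1.0, 2.5, 3.0, 4.0])
print(dnl, inl)
```

Because the LSB is derived from the measured transitions themselves, the INL of the last step is zero by construction in this sketch.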

High-speed predictor
The main function of the predictor, namely predicting a part of the received value, was described above. The predictor also carries out two additional functions:
- averaging the digitized values, which increases the signal-to-noise ratio of the measured signal;
- decreasing the data throughput for further data processing.
The functional diagram of the predictor is depicted in Fig. 45. The predictor consists of a memory in which the data from the ADC are accumulated, an averaging block which averages the accumulated data, and a block in which the output DAC value is calculated. The algorithm used to calculate the DAC value is a modified version of the successive approximation algorithm with a constant ±LSB step.
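A behavioral sketch of one predicting block is given below. It accumulates residue codes, averages them, and then moves the DAC word by one LSB in the direction of the averaged residue. The class name, the update timing (once per averaging interval) and the noise-free test input are assumptions made for illustration; the actual VHDL implementation may differ.

```python
class PredictingBlock:
    """Illustrative model of one predicting block for a single chip position."""

    def __init__(self, averages: int):
        self.averages = averages   # number of residues accumulated per result
        self.accumulator = 0       # memory for the ADC residue codes
        self.count = 0
        self.dac_value = 0         # current prediction in DAC LSB

    def push(self, adc_code: int):
        """Accept one residue code; return a result once per averaging interval."""
        self.accumulator += adc_code
        self.count += 1
        if self.count < self.averages:
            return None
        mean_residue = self.accumulator / self.averages
        result = self.dac_value + mean_residue   # predicted plus measured summand
        # Constant +/-LSB step towards the measured value (assumed rule).
        if mean_residue > 0:
            self.dac_value += 1
        elif mean_residue < 0:
            self.dac_value -= 1
        self.accumulator = 0
        self.count = 0
        return result


block = PredictingBlock(averages=4)
true_level = 5                     # hypothetical constant input in LSB
results = []
for _ in range(40):
    out = block.push(true_level - block.dac_value)  # ideal, noise-free residue
    if out is not None:
        results.append(out)
print(results, block.dac_value)
```

In this model every averaged result already equals the true level, while the DAC word approaches it one LSB at a time and then stays there. The output data rate is reduced by the averaging factor.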
The described functionality is coded in VHDL and implemented using the ECL library available in the IHP BiCMOS technology. For speed purposes, the predictor was divided into several sub-blocks which were implemented separately. This method decreases the complexity of each sub-block and achieves a higher operational speed. The block diagram of the predictor is depicted in Fig. 46. The predictor consists of a demultiplexer, a bank of predicting blocks and a multiplexer. A predicting block carries out three functions: accumulation, averaging, and prediction. The demultiplexer deserializes the M-sequence and distributes the M-sequence parts (chips) to the individual predicting blocks so that each has to work with only one defined chip. The multiplexer reverses the parallel processing and serializes the predicted values, which finally feed the DAC.

High-speed digital-to-analog converter with off-chip calibration
The digital-to-analog converter transforms a digitally predicted value into the analog domain. To prevent information loss, the accuracy of the DAC should correspond to the accuracy of the whole capturing device. Simultaneously, the DAC should work at 10 GS/s. To satisfy both requirements, the converter is implemented using a segmented current steering architecture. The block diagram of the converter is depicted in Fig. 47. It consists of two segments: a unary sub-converter and an R-2R sub-converter. The current sources of both sub-converters are connected to the summing node. As will be seen later from the measurements, relying only on the component matching of the technology would give insufficient accuracy, which in this particular case is 10 times lower than required. Therefore, an additional calibration of the current sources is implemented. The current sources are realized as voltage controlled current sources. The controlling voltages are produced by auxiliary low-power µDACs which are externally controlled via an SPI interface.
The calibration algorithm can be characterized as a successive approximation of the DAC output to the reference value. The detailed calibration flow for each current source is as follows:
1. The current source under calibration (CSUC) is disconnected from the summing node. For this purpose, the corresponding digital input is applied to the DAC.
2. The analog output of the DAC is measured and stored in memory as "zero-value". The measurement is performed with a 14-bit ADC on an FPGA board.
3. The CSUC is connected to the summing node.
4. According to a binary search algorithm, the MSB of the µDAC is set to "1".
5. The output of the ADC is measured again, and the difference between the stored "zero-value" and the measured value is calculated.
6. Depending on this difference, the decision concerning the value of the MSB of the µDAC is made.
7. Steps 4-6 are repeated for the remaining 9 bits of the µDAC.
8. Steps 1-7 are repeated for each current source.
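The calibration flow for a single current source can be sketched as a successive-approximation search. The example below is illustrative only: `measure` stands for a hypothetical routine returning the reading of the external ADC, the source current is assumed to grow monotonically with the µDAC code, and the decision rule (keep a bit as long as the measured step does not exceed the target) is an assumption, since the text does not specify it.

```python
def calibrate_source(measure, target_step, udac_bits=10):
    """Find the µDAC code that brings one current source closest to its target."""
    # Steps 1-2: source disconnected, store the "zero-value".
    zero_value = measure(connected=False, code=0)
    code = 0
    # Steps 3-7: source connected, test the bits from MSB down to LSB.
    for bit in reversed(range(udac_bits)):
        trial = code | (1 << bit)                       # tentatively set the bit
        step = measure(connected=True, code=trial) - zero_value
        if step <= target_step:                         # assumed decision rule
            code = trial                                # keep the bit
    return code


def fake_measure(connected: bool, code: int) -> int:
    """Hypothetical stand-in for the DAC plus ADC measurement chain."""
    offset = 100                                        # output with source off
    return offset + (9000 + 2 * code if connected else 0)


best_code = calibrate_source(fake_measure, target_step=10000)
print(best_code)
```

Step 8 of the flow corresponds to calling `calibrate_source` once per current source, each with its own target value.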
A set-up to implement the proposed calibration scheme is depicted in Fig. 47. The calibration algorithm is implemented on a Spartan-3AN Starter Kit board. Fig. 48 gives both DNL and INL values before and after calibration. INL errors were also recalculated in percent of the input range.
Dynamic measurements: The dynamic characteristics of the DAC were measured together with the 4-bit ADC under the assumption that the LSB usually works faster than the MSB. A direct measurement of the spurious free dynamic range (SFDR) makes no practical sense since the ADC limits the overall performance. For estimating the performance, an envelope test was applied [45]. Proper operation of the converters assumes the presence of all transition steps at the frequency of interest. Fig. 50 shows DAC outputs at 5 GHz and 5.5 GHz. Both converters (ADC and DAC) have all 16 transition levels up to 5.5 GHz input. Only the amplitude at 5.5 GHz starts to decay.

Conclusion
The design and measurements of the high-speed data capturing device for the M-sequence sensor are described in this chapter. The data capturing device utilizes the "stroboscopic feedback loop" for achieving high dynamic range together with high sampling rate.
A number of different techniques are used to achieve the desired performance of the separate components.
To achieve a high effective resolution bandwidth of the analog-to-digital converter, a new segmented reference network was proposed. The new network, implemented in the ADC [46], increases the effective resolution bandwidth several times compared to a similar conventional one [47], while the power dissipation is only slightly increased.
The high-speed predictor was described in VHDL and implemented using a high-speed ECL library. Despite the disadvantage of the power dissipation, the ECL implementation allows speeds of up to 10 GS/s to be achieved. Furthermore, it is simple to modify the predictor to comply with different system parameters, such as the M-sequence length or averaging factor.
An off-chip calibration was implemented for the high-speed digital-to-analog converter. The calibration is implemented on an FPGA-board. After having been modified slightly, it could be integrated into the DAC. The static errors of the DAC after calibration are lower than 0.15 % which allows the use of a converter in the data capturing device with a target resolution of 9 bits.

Introduction
While the previous sections discussed specific sub-components, such as the individual semiconductor chips of a UWB sensor, we would like to consider some aspects of the whole sensor electronics here. For that purpose, several M-sequence devices were implemented at different integration levels, and some UKoLoS partners (ultraMedis, CoLoR) were provided with demonstrator devices for their own use. In order to have a running sensor system, the device implementation has to cover the whole manufacturing cycle from chip design and manufacture, chip housing, RF-PCB design and assembly, design and implementation of the digital components (ADC, FPGA, interfaces etc.) up to the programming of sensor internal pre-processing, the data transfer to the host PC and application-specific software for data evaluation and visualization. Furthermore, device specific test and evaluation methods and routines had to be developed and implemented in order to perform high-resolution device characterization (e.g. [48]).
In what follows, we will first introduce an experimental device which is intended to evaluate new concepts or modifications under real conditions. Secondly, we refer to a device configuration which implements the principle depicted in Fig. 4 for practical use by other UKoLoS projects. Finally, there will be some discussion of single-chip solutions.

Device concept and aim
The aim of an experimental demonstrator device is to investigate the impact of individual sub-components on the performance of the whole device, as well as to have the opportunity to flexibly perform device modifications without the need of redesigning complex RF-PCBs. The device is organized in a modular concept as symbolically depicted in Fig. 51. Fig. 52 shows an example of a demonstrator implementation of such kind.
The individual sub-components, such as the shift register for stimulus generation, T&H circuits, RF power distribution, RF synchronization etc., are organized as plug-ins. Hence, one can simply replace a device component by a new one if improved circuits, better IC housing or RF-PCBs are available. Furthermore, the various modules may be interconnected to form different device structures as shown in Fig. 4 or Fig. 55.

Demonstrator performance
The particular RF plug-ins of the demonstrator are designed to operate with signals of large fractional bandwidth at the lower end of microwave frequencies or with toggle frequencies up to about 20 GHz. The generator unit provides periodic M-sequences of length 2^m - 1, where m represents the order of the sequence. The demonstrator optionally implements 9th or 12th order generators, which accordingly produce signals with periods of 511 or 4095 chips. The generator plug-in operates with toggle rates between 500 MHz and 20 GHz for the 9th order M-sequence, and the 12th order device may be operated between 500 MHz and 16 GHz. In the case of radar applications, the unambiguity range (4) of the measurement may cover values from 3.8 m (related to the 9th order M-sequence and 20 GHz clock) up to 1.2 km (12th order M-sequence and 500 MHz clock).
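The sequence length and the quoted range limits can be reproduced with a short sketch. Two assumptions are made here: the feedback taps (9, 5) are a common choice for a 9th order maximum-length shift register and are not taken from the described hardware, and the unambiguity range is computed as half the distance travelled by light during one sequence period, which is consistent with the numbers above. All names are illustrative.

```python
C0 = 299_792_458.0  # speed of light in vacuum, m/s


def m_sequence(order=9, taps=(9, 5)):
    """Generate one period of an M-sequence with a Fibonacci shift register."""
    state = [1] * order                    # any non-zero start state works
    chips = []
    for _ in range(2 ** order - 1):
        chips.append(state[-1])            # output of the last register stage
        feedback = 0
        for tap in taps:                   # XOR of the tapped stages
            feedback ^= state[tap - 1]
        state = [feedback] + state[:-1]    # shift by one stage
    return chips


def unambiguity_range(order, clock_hz):
    """Half the distance light travels within one sequence period (assumed)."""
    period_s = (2 ** order - 1) / clock_hz
    return C0 * period_s / 2.0


chips = m_sequence()
print(len(chips), sum(chips))
print(unambiguity_range(9, 20e9), unambiguity_range(12, 0.5e9))
```

The generated period contains 511 chips, and the two range values come out at about 3.8 m and 1.2 km, matching the limits stated in the text.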
The clock synchronization unit, which precisely defines the receiver sampling points, is a 9th order binary divider with a maximal toggle rate of 24 GHz. Random fluctuations of the sampling point (jitter) could be reduced down to some tens of femtoseconds [48] due to the balanced circuit topology and the optimized architecture of the timing system (see [2], [49] for detailed discussions). Note that the time position uncertainty of the measured impulse response (compare Fig. 5) is further decreased as a consequence of the impulse compression (i.e. correlation; see [2] for discussion).
The clock distribution plug-in is an active device which recovers and distributes the sampling clock among the receivers and the analog-to-digital converters. The unit can handle clock pulses with 20 ps falling/rising edges and features wideband reverse signal rejection better than 40 dB per branch.
The receivers are ultra-wideband sampling gates with an 18 GHz analog input bandwidth, better than -40 dB signal feed-through over the full bandwidth, -15 dBm input compression points and a decay rate of about 20 % per ms relative to full scale (i.e. 5…200 ppm per sampling cycle depending on the clock rate (0.5 to 20 GHz) of the system). Other potential components of the experimental demonstrator are discussed in sub-chapters 3 to 5.
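The conversion from the decay rate per millisecond to the decay per sampling cycle can be checked numerically. The sketch assumes that one sample is taken per output period of the 9th order binary divider (i.e. every 512 clock cycles) and that the decay is approximately linear over such a short interval; the function name is hypothetical.

```python
def droop_ppm_per_cycle(clock_hz, decay_per_ms=0.20, divider_order=9):
    """Hold-voltage decay per sampling cycle in ppm of full scale."""
    sample_period_s = 2 ** divider_order / clock_hz   # assumed sampling period
    return decay_per_ms * (sample_period_s / 1e-3) * 1e6


print(droop_ppm_per_cycle(20e9), droop_ppm_per_cycle(0.5e9))
```

Under these assumptions the result is about 5 ppm at a 20 GHz clock and about 205 ppm at 0.5 GHz, in line with the range given above.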
The transmitter-receiver and receiver-receiver cross-talk is better than 130 dB over the full operational band. In order to achieve this value, attention was paid to RF housing, clock signal distribution and power supply decoupling (see also Fig. 57). The recent configuration of the demonstrator RF electronics is able to handle (internally) up to about 70 000 IRFs per second (9th order M-sequence at 18 GHz system clock). The data transfer to a host PC (based on commercial standard interfaces like USB and LAN) reduces, however, the actually achievable update rate to about 300 IRFs/s. The corresponding gap is filled by synchronous averaging in order to use the available data amount for noise suppression. The achievable receiver dynamic range is about 114 dB @ 1 IRF/s.
It has to be noted that device non-linearity is classically qualified by the intercept point, which is based on a Taylor-series model of the device under test and sine wave stimulation. In order to keep this established philosophy, the approach was extended to wideband signals [48]. This is illustrated by Fig. 53. In the example at the top, the Tx and Rx ports of an M-sequence device were connected via a variable attenuator and the impulse response was recorded for attenuator values between 0 and 120 dB. In the case of very weak input signals (large attenuation), we can only observe noise and device internal cross-talk. If we reduce the attenuation, the wanted signal peak (called "main pulse" in Fig. 53) appears and increases linearly with the signal level while the cross-talk level remains constant. By reducing the attenuation further, other signal parts begin to protrude from the noise. They also increase linearly at the beginning. These signals are caused by device internal reflections, deviations from the ideal time shape of an M-sequence and misalignments of the ADC timing (refer to Fig. 54). We call them device internal clutter.
For very high signal levels, the receiver will tend to saturate which leads to the compression of the main peak and the internal clutter signals. Furthermore, the appearing non-linear distortions create new signal peaks which leave an apparently chaotic mark (see [2] and Fig. 1 for details). While the cross-talk and the internal clutter may be removed by device calibration [50] since they are caused by linear effects, the non-linear distortions should be avoided by respecting corresponding input levels of the measurement signal.
The level diagram at the bottom of Fig. 53 refers to the non-compressed receiver signal. It shows the strength of the linear, quadratic and cubic signal parts as a function of the signal power (see [48] for details). The effect of ADC timing misalignment is illustrated in Fig. 54. Theoretically, the ADC could capture the voltage sample at any time point within the hold phase of the T&H circuit since, by definition, the signal level should keep a constant value during the hold interval.
Unfortunately, this is not the case, as demonstrated by Fig. 54. Here, the impulse response (i.e. the correlation function) of the M-sequence device was recorded while inserting a variable delay between the start of the hold phase and the trigger of the ADC. Ideally, we should see only a single pulse as long as the ADC is triggered within the hold phase, and noise within the track phase (which is, however, of no interest here). But actually, some spurious signals appear whose strength and time position depend on the delay between "hold-start" and ADC trigger. Hence, by selecting a reasonable delay between T&H and ADC, we can minimize these spurious signals.
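The internal measurement rate and the averaging gain mentioned earlier in this section can be estimated with a few lines. The sketch assumes that one sample is taken every 2^9 clock cycles (9th order binary divider), that one impulse response needs 2^m - 1 samples, and that averaging N measurements with uncorrelated noise improves the signal-to-noise ratio by 10·log10(N) dB. The function name is illustrative.

```python
import math


def irf_rate(clock_hz, order=9, divider_order=9):
    """Impulse responses per second under the stated sampling assumptions."""
    samples_per_irf = 2 ** order - 1
    return clock_hz / (2 ** divider_order * samples_per_irf)


internal_rate = irf_rate(18e9)          # internal rate at 18 GHz system clock
averages = internal_rate / 300.0        # averaging needed for 300 IRFs/s
gain_db = 10.0 * math.log10(averages)   # assumed SNR gain by averaging
print(round(internal_rate), round(averages), round(gain_db, 1))
```

The estimate gives roughly 69 000 impulse responses per second, consistent with the "about 70 000" quoted above. Reducing the update rate to 300 IRFs/s then corresponds to about 230 averages, or a noise suppression of roughly 24 dB under the stated assumption.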

M-sequence feedback-sampling
This sub-chapter gives an example of the usefulness of the modular experimental device. It deals with feedback sampling. Feedback loops have been used for a long time in sampling circuits. However, they were usually restricted to sequential sampling having very large Nyquist rates so that only minor signal variations between consecutive samples appear. Only these variations are captured by that approach (see [2] for details).
In our case, this simple method cannot be applied since the voltage steps between two consecutive samples may cover the full receiver input range as we firstly apply Nyquist sampling and secondly, the natural order of the data samples may be disrupted due to interleaved sampling. Hence, we need some modifications of the principle which pose some challenges to the practical implementation.
For the purpose of feedback sampling, the data capturing & control unit was additionally equipped with a digital-to-analog converter which has to provide the feedback signal. The principle and the device structure are depicted in Fig. 55. The idea behind the digital feedback sampling implementation is to deal with high-speed signals (analog and digital) of low dynamic range (i.e. low amplitude) and to exploit the fact that the temporal variations of the scenarios under test are orders of magnitude slower than the measurement speed. This implies for the radargram (see Fig. 55, on the left) that adjacent samples on a horizontal line undergo only minor variations (instead of consecutive samples in sequential sampling). Thus, it will be possible to predict the measurement values along the observation time axis. This is the reason to insert a DAC into the feedback loop which converts the predicted digital values into analog ones. If the predicted signal levels are subtracted from the received signal, only the prediction error has to be captured by the ADC and processed by a digital high-speed system.
In the open loop example (above), we can observe that the hold voltage jumps from sample to sample. Hence, the ADC must be able to convert voltages within a large range. The second example shows the closed loop operation. Now, the predicted value is subtracted before AD conversion, and we actually get a voltage during the hold phase which is always at about the same level. Under optimum conditions, the magnitude of the prediction error is determined by the strength of random noise, which is usually quite weak. Therefore, the requirements on the dynamic range of the receive electronics can be relaxed.
Fig. 57 shows a photograph of a primary (1Tx 2Rx) M-sequence RF board and the corresponding ADC PCB with PC interface (USB). The RF board is designed for assembly with the ICs originating from the HaLoS project. Each of the board layouts corresponds to the architecture shown in Fig. 4, so that both boards connected together represent the basic M-sequence working unit. This unit is considered the main integral part of the UWB devices provided for partner projects within UKoLoS and other scientific projects.

Single-chip sensor head
The ability to create an optimized multi-chip sensor is apparent, but the manufacturability of such a system is much more difficult with a longer parts list and more complex assembly, as for instance in the construction of complex MIMO sensing systems (see the 8Tx 16Rx system in chapter 11). One promising way is to realize all active high-frequency system components (i.e. the components on the primary RF board, see Fig. 57, left-hand side) on one chip. This will enhance the overall system performance, reliability, robustness and assembly yield. On the other hand, increased complexity on the single die means more second-order effects that have not been studied so far. For example, undesired on-chip coupling between the different constituent system components becomes more pronounced and is more challenging to manage, especially when dealing with ultra-wideband signals. Such unwanted signal coupling or cross-talk can degrade the performance of the sensitive receive circuitry and, consequently, of the whole system. The aim of studying such interactions, which have not been considered so far, the expected advantages, and the knowledge gained from analyses of the multi-chip approach have motivated the first monolithic integration of the complete RF part of the M-sequence UWB radar electronics into one silicon die. Fig. 59 shows the simplified block topology of the realized M-sequence based single-chip transceiver head (alias System-on-Chip, SoC head). In correspondence with the system topology depicted in Fig. 4, the M-sequence transceiver SoC contains one transmitter and two receiver circuits (commonly referred to as a 1Tx 2Rx configuration).

UWB single-chip head architecture
According to our experience, the 1Tx 2Rx topology of the primary sensor cell represents the optimum regarding achievable performance and circuit complexity. Moreover, the implementation of the 1Tx 2Rx structure on one die has the advantage of permitting cross-talk investigations both between active and passive circuit parts (i.e. transmitter and receiver) and between two passive parts (i.e. receiver 1 and 2). From a practical point of view, the stand-alone 1Tx 2Rx devices are suitable for implementations where two receive channels are needed a priori, e.g. for simple localization tasks or in material testing (see chapter 11), in which the second (slave) receive channel can be used for online device calibration. The desired MIMO usability, as for instance in novel UWB arrays for high-resolution near-field imaging (ultraMedis) or localization (CoLoR) with a 1Tx 2Rx constellation of primary sensing cells, is also given.

Design philosophy
It is apparent that the presented single-chip architecture places ultra-fast switching cells (i.e. stimulus generator and synchronization unit) with their relatively high-swing output buffers and very sensitive analog input blocks on the same chip substrate. Consequently, undesired signal coupling or cross-talk can degrade the performance of the sensitive receive circuitry and, in turn, of the whole system. Especially in the case of analog devices that handle ultra-wideband signals, on-chip interference can be catastrophic. For example, intermodulation or interaction of noise components with the measured signal within the frequency band of interest may cause device saturation. Therefore, special emphasis is put on the isolation of the SoC channels during the design phase, as discussed in [2] or [49].

Individual functional block peculiarities
The individual functional circuit cores of the SoC transceiver are designed to meet at least the parameters of the demonstrator plug-in blocks discussed above. In addition, the SoC transceiver includes built-in options that open up further functionality, e.g. (equivalent-time) oversampling [51] or frequency conversion in order to meet the UWB radiation rules [52]-[55]. In particular, the SoC concept is intended for very wideband material investigations and MIMO applications such as medical microwave imaging [56]-[61].
In summary, the goals of the single-chip integration are:
• to improve the synchronization between transmitter and receiver due to shorter interconnections with steeper signal edges,
• to provide means of a flexible adaptation of the operational frequency band by introducing a wideband modulator,
• to save power consumption by avoiding power-hungry PCB interconnection lines,
• to investigate broadband signal leakage on chip and cross-talk due to the housing,
• to avoid temperature effects on calibrated sensor systems due to temperature difference between the measurement channels and temperature expansion of device internal cables.
Thus, to achieve the desired MIMO usability, the shift register can be enabled and disabled, and the transmitter buffers can be switched off (power down) by simple TTL signals, so that no external RF switch is required for operation in a MIMO system. The transceivers are designed such that they can either be driven individually or be cascaded with respect to the master system clock so that all units of a MIMO array work synchronously. Once the array is calibrated, the power-down feature is used to select the active transmitter. Thus, all receivers of the MIMO array work in parallel and capture data continuously in order to achieve maximum measurement speed. As shown in Fig. 59, the transceiver IC is equipped with a wideband multiplier, which optionally allows the sensor stimulus frequency band to be shifted and doubled ([2] or [49], [55]) or the operational band to be adapted to a specific application [50] in conformity with regulation requirements [52]-[54]. The channel is designed for operation up to 18 GHz.
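The transmitter selection described above can be sketched as a simple measurement schedule. The following is a minimal illustration only, not the actual control firmware: the function name `mimo_schedule`, the dictionary layout and the choice of eight cascaded 1Tx 2Rx units are assumptions made for the example. In each step exactly one transmitter buffer stays powered while all receivers of the array capture in parallel.

```python
def mimo_schedule(num_units, rx_per_unit=2):
    """Build one measurement step per active transmitter.

    Hypothetical model: every unit is a 1Tx 2Rx transceiver; in each step
    all transmitter buffers except one are powered down, and every receiver
    of the array records data at the same time.
    """
    # Every receiver is identified by (unit index, receiver index).
    receivers = [(u, r) for u in range(num_units) for r in range(rx_per_unit)]
    steps = []
    for active_tx in range(num_units):
        # True means: the Tx buffer of this unit is switched off (power down).
        tx_power_down = [u != active_tx for u in range(num_units)]
        # All receivers capture in parallel, giving one channel per receiver.
        channels = [(active_tx, rx) for rx in receivers]
        steps.append({
            "active_tx": active_tx,
            "tx_power_down": tx_power_down,
            "channels": channels,
        })
    return steps


steps = mimo_schedule(num_units=8)
total_channels = sum(len(step["channels"]) for step in steps)
print(len(steps), "steps,", total_channels, "Tx-Rx channels")
```

Under these assumptions, eight units yield eight sequential steps and 8 x 16 = 128 Tx-Rx channels, because the receivers never have to be switched between steps.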
Moreover, the multiplier can invert the stimulus M-sequence by applying simple ECL signals to the control port. This feature may be useful to provide uncorrelated transmit signals in MIMO arrays. In addition, the sampling timing control chain is equipped with an optional switchable shunt path. This add-on allows the clock to be supplied directly from the chip periphery. Thereby, user-selectable sampling rates or enhanced signal capturing approaches (e.g. the equivalent-time oversampling approach [50]) are possible without IC redesign. The analog receivers are designed to operate with wideband signals up to 18 GHz. The maximum peak-to-peak input signal swing for linear operation is 60 mV. Fig. 60 shows the die micrograph of the discussed transceiver with the individual functional blocks marked and the clearly visible top metal of a decoupling guard between the transmitter and the receivers (line in the middle). The transmitter and receiver cores as well as their respective I/O pads are placed on opposite sides of the die to minimize mutual on-chip coupling as well as inductive coupling between the bond wires after packaging. As extensively discussed in [62] or [63], [64], the decoupling guard is a guard well in a trench between the noisy transmitter and the sensitive receivers. In the final assembly, the guard is connected to the quiet potential in order to fix the voltage of the substrate between the Tx and Rx parts of the die by absorbing potential substrate fluctuations. The transceiver die occupies an area of about 2000 µm × 1200 µm, and the built-in circuits consume in total about 300 mA from a 3 V supply.
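The effect of inverting the stimulus can be illustrated with a small numerical sketch. It is not a model of the on-chip shift register: the feedback taps (9, 5), the all-ones start state and all function names are assumptions chosen for the example, and the way the inversion is actually exploited in a MIMO array is not specified here. The sketch generates one period of a 9th order M-sequence and compares the periodic correlation of the normal and the inverted sequence.

```python
def m_sequence(order=9, taps=(9, 5)):
    """Return one period (2**order - 1 chips) of a binary M-sequence.

    A Fibonacci-type linear feedback shift register is used. The taps are
    1-based register positions that are XORed to form the feedback bit;
    (9, 5) is an assumed choice that gives a maximal-length sequence.
    """
    state = [1] * order                      # any non-zero start state works
    chips = []
    for _ in range(2 ** order - 1):
        chips.append(state[-1])              # output of the last register stage
        feedback = 0
        for tap in taps:
            feedback ^= state[tap - 1]
        state = [feedback] + state[:-1]      # shift by one stage
    return chips


def to_bipolar(chips, invert=False):
    """Map chips 0/1 to +1/-1; invert=True mimics the inverted stimulus."""
    sign = -1 if invert else 1
    return [sign * (1 - 2 * chip) for chip in chips]


def circular_correlation(a, b):
    """Periodic cross-correlation of two sequences of equal length."""
    n = len(a)
    return [sum(a[i] * b[(i + lag) % n] for i in range(n)) for lag in range(n)]


chips = m_sequence()
normal = to_bipolar(chips)
inverted = to_bipolar(chips, invert=True)
acf = circular_correlation(normal, normal)
ccf = circular_correlation(normal, inverted)
print(len(chips), acf[0], set(acf[1:]), ccf[0])
```

For a maximal-length sequence of this order, the periodic autocorrelation equals 511 at zero lag and -1 at all other lags, and inverting the stimulus flips the sign of the correlation peak. A receiver that correlates with the reference sequence can therefore distinguish periods sent with normal and inverted polarity.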

Single chip transceiver head evaluation
For the evaluation of the sensor head prototype, the transceiver chip has been measured on wafer as well as packaged with well-established chip-on-board technology, using an optimized composite 4-layer carrier made from Rogers 4003C™ and FR4 laminate and an ultrasonic bonding procedure with 1-mil aluminum wires. The bond wire landing areas for the RF ports on the board are designed to match the pitch on the IC as closely as is realizable in order to avoid long wire connections. The cavity approach has not been implemented because its technological realization on the selected carrier board is challenging. Fig. 61 shows the test board, with the wire-bonded die shown enlarged for better visualization. The die is located in the center of the photo, and top glue is used to protect the bond wires. The die is mounted on a metal patch which is connected to VEE. This allows a direct connection between the substrate and the board's lowest potential. The top layer is mainly used for RF signal routing, whereas the bottom layer is used for control lines. The inner plane layer below the top layer is the common board GND, and it also provides the GND reference for the RF signal lines. The other inner plane is the supply layer. The supply is bypassed to GND with a 0.1 µF ceramic capacitor placed as close as possible to the die.
A basic test set-up for evaluating the parameters of the sensor head prototype is depicted schematically in Fig. 62. The photograph on the right side shows an example of such a test assembly. The evaluation board is connected to a 10-bit data "digitizer" (ADC) and to an FPGA control and pre-processing unit, which is also equipped with the PC interface. Moreover, for the parameter characterization, a stable sinusoidal reference has to be connected to the system clock port (not shown in the photograph). This signal is fed to the board through an SMP connector and toggles the analyzed assembly. The toggle rate for the packaged prototype can be chosen quite flexibly between 0.5 and about 19 GHz, which complies well with the needs of the current and intended applications.
Inter-channel cross-talk plays an important role in many applications. As a Tx-Rx decoupling of up to 130 dB could be reached if the individual components are properly shielded (see Fig. 57, left), an interesting question is how the single-chip devices behave in this respect, even though decoupling design techniques are implemented [2]. Fig. 63 shows the results for on-wafer measurements and for the housed chip. Obviously, the chip design outperforms the chip wiring with respect to cross-talk performance. The impulse response function of the housed chip is also shown in Fig. 63 (left). It was obtained using the configuration depicted in Fig. 62 (left). The cross-talk pulse can clearly be identified. However, it should be noted that it can largely be suppressed in post-processing by system calibration. Thus, the achieved spurious-free system dynamic range is comparable to that of the demonstrator device.
Figure 63. Example of a normalized IRF captured with a 9th order M-sequence based experimental single-chip assembly, and broadband measured Tx to Rx isolation.
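To give a feeling for the numbers quoted above, the following sketch converts the toggle rate into the period duration of a 9th order M-sequence and expresses a decoupling value in dB as an amplitude ratio. It assumes that one chip is generated per clock cycle, so that the chip rate equals the toggle rate; the function names are hypothetical and serve illustration only.

```python
def sequence_period(clock_hz, order=9):
    """Return (number of chips, period duration in seconds).

    Assumption: one chip per clock cycle, i.e. chip rate == toggle rate.
    """
    chips = 2 ** order - 1
    return chips, chips / clock_hz


def db_to_amplitude_ratio(value_db):
    """Convert a level difference in dB into a voltage (amplitude) ratio."""
    return 10.0 ** (value_db / 20.0)


# Lower and upper end of the toggle-rate range of the packaged prototype.
for clock_hz in (0.5e9, 19e9):
    chips, period_s = sequence_period(clock_hz)
    print(f"{clock_hz / 1e9:5.1f} GHz: {chips} chips, period {period_s * 1e9:.2f} ns")

# Decoupling reported for properly shielded individual components.
print(f"130 dB corresponds to an amplitude ratio of {db_to_amplitude_ratio(130.0):.3g}")
```

Under the stated assumption, the period of the 511-chip sequence shrinks in proportion to the toggle rate, which is what makes the time window of the measured impulse response adjustable through the clock alone.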
In conclusion, we have successfully realized a novel functional hardware platform, both multi-chip and single-chip based, for current (e.g. [56], [60], [61]) as well as future scientific investigations in the complex field of ultra-wideband MIMO sensing and localization.

Summary
Electromagnetic sounding for non-destructive testing and remote sensing has been exploited for a long time. However, its practical application was mostly restricted to narrow-band sensors, or it was confined to the laboratory in the case of wideband examinations. The reason for this limitation has been the lack of reasonable wideband measurement equipment.
The first field-deployable ultra-wideband devices were used in ground-penetrating radar (GPR). They mostly exploited powerful nanosecond or sub-nanosecond pulses to feed the transmission antenna. Meanwhile, several other UWB sensor techniques have been introduced. Section 2 summarizes the most popular of them. The challenges of the corresponding research and development lie mainly in improving the performance of the sensor electronics and in its monolithic integration, aimed at cost and power reduction.
The main part of the chapter deals with a pseudo-noise UWB approach and its main components. The pseudo-noise concept is an interesting alternative to other wideband sensing principles, promoting both high device performance and monolithic integration. Due to its simple and rigid synchronization, it provides exact and time-stable signal generation and signal capture, which promotes:
• simple adaptation of bandwidth, signal duration (period duration) and recording time to the needs of the actual application,
• the implementation of large MIMO arrays,
• data processing in the time and frequency domain,
• device calibration as is usual with network analyzers,
• high range precision and super-resolution capabilities, and
• excellent micro-Doppler performance.
The most relevant RF components of a pseudo-noise sensor cover the test signal generation (i.e. pseudo noise code), the analog handling of the receive signals, and the high-speed conversion of the analog signals to the digital domain. Device concepts suited for these tasks are discussed in sections 3 to 5. Due to special requirements set by the application and the applied semiconductor technology, innovative solutions are presented. Among those are a distributed power amplifier with a novel cascode gain cell, new subtraction amplifiers, an analog-to-digital converter with a new reference network, and a high-speed predictor. Also, appropriate verification schemes are presented. A final section referring to implemented devices as they were applied in other UoKoLoS-projects suggests some first steps toward a fully integrated pseudo-noise sensor device.