Comparison of RSA-based time-to-digital conversion techniques.
Abstract
This chapter reviews the technical evolution of the random sampling-and-averaging (RSA) technique and its associated variance-reduction (VR) methods for time-interval measurements in emerging microelectronic and quantum applications. First, the theoretical analysis of the RSA technique based on stochastic Monte Carlo methods is elaborated for power-efficient and highly accurate signal and photon detection, including both synchronous and asynchronous RSA measurement techniques with superior time-domain detection resolution, scalable dynamic ranges, high linearity, high noise-immunity, and low power/area consumption. Second, to further enhance the conversion-rate of the RSA measurements, the theoretical expectations, variances, and correlation coefficients of two RSA-compatible VR techniques, self-antithetic and control-variate VR, are comprehensively derived and expressed in closed mathematical forms for practical integrated-circuit (IC) hardware realization.
Keywords
- antithetic variate
- control variate
- correlated random variable
- independent and identically distributed
- joint probability density function
- Monte Carlo method
- quantum probability amplitude
- stochastic random sampling
- time-correlated single-photon counting
- time-domain modulo operation
- time-to-digital converter
- variance reduction
1. Introduction
Today, time-correlated single-photon counting (TCSPC) [1, 2] serves as the primary functionality in various emerging microelectronic and quantum technologies. Different TCSPC architectures have their own pros and cons across multiple performance specifications depending on the requirements of time-to-digital conversion-rate or conversion-accuracy, which roughly categorizes TCSPC applications into two major areas as follows: (a) Quantum imaging, quantum sensing, positron emission tomography, time-resolved spectroscopy, fluorescence-lifetime imaging, light detection-and-ranging, and time-of-flight sensing, etc. employ small-area and low-power time-to-digital conversion (TDC) implementations with the disadvantages of low resolution, low accuracy, and high clock-generation power. (b) Quantum cryptography, Q-bit-state probability-amplitude measurements, live-cell and tissue microscopy, and molecular imaging, etc. mainly exploit high-resolution TDC implementations with the downsides of low conversion-rates, high calibration complexity, and high-order digital filtering. In the long run of quantum-technology development, the demand for supporting both high speed and high resolution with low power/area consumption will be the common direction of all TCSPC realization approaches.
Therefore, this chapter reviews the theoretical analysis of the random sampling-and-averaging (RSA) technique based on stochastic Monte Carlo methods, which performs high-accuracy TCSPC functionality with the downside of slow conversion-rates. Then, two RSA-compatible variance-reduction (VR) techniques, self-antithetic and control-variate VR, are elaborated, mainly for enhancing the conversion-rate of asynchronous RSA to realize a unified RSA-based TDC architecture covering both categories of high-speed and high-resolution TCSPC applications. To verify the feasibility of the VR techniques, this chapter comprehensively summarizes the theoretical expectations, variances, and correlation coefficients in closed-form mathematical expressions for practical integrated-circuit (IC) hardware realization.
2. Random sampling-and-averaging techniques
Monte Carlo methods have been broadly used in the fields of applied mathematics and financial engineering [3]. The main focus of this chapter, the RSA measurement technique, originated from the principle of Monte Carlo methods based on the analogy between volume statistics and probability. In the field of stochastic processes, probability density functions (PDF) formalize the concept of probabilities to define the volumes of possible outcomes. Meanwhile, Monte Carlo methods obtain the volumes from experiments and then interpret the volumes as probabilities. The relationship between the theoretical probability and the experimental Monte Carlo method is summarized in Eqs. (1)-(3) by examining the expectation and mean values of a random variable, Y [4].
N is the total number of samples; Yn represents the n-th experimental sample of Y; E[Y] represents the expectation of the random variable Y based on its PDF, f(y); Ȳ denotes the Monte Carlo estimate of E[Y], i.e., the mean of the N samples.
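A sketch of what Eqs. (1)-(3) express, consistent with the symbol definitions above (the sample mean of Y approximating its expectation under the PDF f(y)); the exact grouping in the published equations may differ:

```latex
E[Y] = \int_{-\infty}^{\infty} y \, f(y)\, \mathrm{d}y
\;\approx\;
\bar{Y} = \frac{1}{N}\sum_{n=1}^{N} Y_n ,
\qquad
\bar{Y} \xrightarrow{\;N \to \infty\;} E[Y].
```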
Eqs. (4) and (5) provide two pieces of important information as follows: (a) Since the delta between the Monte Carlo estimate and the ideal expectation, (Ȳ − E[Y]), diminishes as the number of samples N grows, the estimate converges to the true expectation by the weak law of large numbers (WLLN). (b) The variance of the estimate shrinks in proportion to 1/N, so the achievable measurement accuracy is ultimately set by the total sample count.
To implement the RSA-based TDC in a single-photon time-interval measurement, the TCSPC system utilizes a time-to-voltage conversion (TVC) circuit, two identical voltage-controlled delay lines (VCDL), and an edge combiner to convert the one-time captured Δt information, which is the quantity under measurement, into a periodic digital clock signal, CKτ, carrying a scaled version of Δt within each clock cycle, thereby enabling an unlimited number of samples to be processed. The TVC first generates a single pulse, INT, whose pulse width equals the time difference, Δt, between the rising edges of the two pulses under detection. Then, the INT pulse width enables an analog integrator implemented by a tunable constant current source, II, charging the integration and parasitic capacitors, CI and CP, to form a DC voltage, VTVC. The time-interval, Δt, is converted and retained in the voltage domain as a differential DC voltage, ΔV = VDD − VTVC = KTVC·Δt, where KTVC is the conversion-gain of the TVC set by the magnitudes of II and CI. A variable-gain amplifier (VGA) buffers the constant voltage information with its gain, KVGA, to one of the following VCDLs. Because of the control-voltage difference between VDD and VVGA, these two identical VCDLs generate two clock signals, CK1 and CK2, with a common frequency, 1/T, and a constant delay, τ = (KTVC·KVGA·KDL)·Δt, where KDL is the conversion-gain of the VCDLs. After a rising-edge combiner, the CKτ signal merged from CK1 and CK2 is a periodic pulse carrying the scaled time-interval information, τ, as its duty-cycle in every T. During this Δt-to-τ conversion process, the maximum range of the time-interval measurement, ΔtMAX, is mapped onto the period of CKτ (T), so the dynamic ranges for different time-interval (Δt) measurements can be set by the overall circuit conversion-factor, (KTVC·KVGA·KDL), for a given T.
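The Δt-to-τ scaling chain described above can be sketched numerically; the gain values below are illustrative assumptions, not values from the text:

```python
# Hedged sketch of the Delta-t-to-tau conversion chain (tau = K_TVC*K_VGA*K_DL*dt).
# The gain values k_tvc, k_vga, k_dl are illustrative assumptions.
def delta_t_to_tau(delta_t, k_tvc=2.0, k_vga=1.5, k_dl=0.1):
    """Scale a measured time-interval delta_t into the delay tau carried
    as the duty-cycle of CK_tau within each period T."""
    return k_tvc * k_vga * k_dl * delta_t

# Full-scale mapping: Delta_t_MAX maps onto exactly one CK_tau period T.
T = 10.0                          # CK_tau period (arbitrary units)
dt_max = T / (2.0 * 1.5 * 0.1)    # Delta_t_MAX implied by this conversion factor
tau = delta_t_to_tau(dt_max)
assert abs(tau - T) < 1e-9        # the maximum interval fills one full period
```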
At this point, CKτ is sampled by a single digital data flip-flop (DFF), which is basically a 1-bit TDC process per sample. The sampling clock of the DFF, CKDCO, is independently generated by a free-running ring oscillator digitally controlled by a pseudo-random-binary-sequence (PRBS) generator for random frequency modulation; i.e., a digitally-controlled oscillator (DCO). With this random sampling-frequency capability, the assumption of uncorrelation [4, 5] between CKτ and CKDCO can be guaranteed, and the waveform or voltage of CKτ at any time instant within T has an equal chance to be sampled by CKDCO, forming a one-dimensional geometric probability density function [4]. Any sampled voltage from CKτ is converted to either a Logic-1 or Logic-0 at the DFF output, Yn, depending on whether the sampled voltage is larger or smaller than the intrinsic threshold of the DFF. After NDCO sampling cycles, the average of these NDCO outcomes at the DFF output (Y1, Y2, …, YNDCO) forms the Monte Carlo estimate, Ȳ, of each RSA measurement.
The RSA process can be theoretically described from different points of view as follows: (a) When the sampling number NDCO is sufficiently large, the PDF of the DCO sampling instants should be uniformly distributed across the period of CKτ, i.e., fDCO(t) = 1/T, t ∈ [0, T), so in the time-domain the probability of obtaining a Yn as Logic-1, P1, is therefore the ratio of τ to T (= τ/T). (b) In the voltage domain, the probability function of Y, f(y), is a Bernoulli distribution [4] because Yn has only two possible outcomes, Logic-1 or Logic-0, and their corresponding probability values are P1 (= τ/T) and P0 (= 1 − P1), respectively. (c) The Monte Carlo estimate, Ȳ, is therefore the averaged DFF output, whose expectation equals P1 = τ/T, as formulated in Eq. (6).
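A minimal Monte Carlo sketch of this 1-bit sampling process, with an idealized uniform sampler standing in for the DCO and an assumed duty-cycle τ/T = 0.3, shows the averaged DFF output converging to τ/T:

```python
import random

random.seed(0)

T, tau = 1.0, 0.3            # CK_tau period and scaled interval (assumed values)
N_DCO = 200_000              # number of random 1-bit samples

ones = 0
for _ in range(N_DCO):
    t = random.uniform(0.0, T)       # idealized uniform sampling instant in [0, T)
    ones += 1 if t < tau else 0      # DFF outputs Logic-1 inside the positive pulse

estimate = ones / N_DCO              # Monte Carlo estimate of the duty-cycle tau/T
assert abs(estimate - tau / T) < 0.01
```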
In Eq. (7), the variance Var[Ȳ] of each RSA measurement equals P1·P0/NDCO, since the NDCO Bernoulli samples are I.I.D.; the variance therefore shrinks as the sampling number NDCO grows.
As mentioned, the RSA is basically the average of a total of NDCO 1-bit TDC processes, and its achievable accuracy is a function of the sampling number NDCO. Therefore, the performance metrics of an RSA-based TDC can be reflected by the accuracy and conversion-rate of each RSA measurement result, Ȳ, as formulated in Eqs. (8) and (9).
In Eq. (9), the signal power is either P1² or P0² based on the magnitude of P1 compared to 0.5, due to the symmetric and signal-dependent variance property of the Bernoulli distribution. Conceptually, Eq. (9) can be examined by considering NDCO = 1: the RSA variance (Var[Ȳ]) then degenerates to the single-sample Bernoulli noise power, P1·P0, which corresponds to the accuracy of a single 1-bit TDC process.
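Under the I.I.D. Bernoulli model above, the variance and SNR take the following closed forms (a reconstruction consistent with the surrounding discussion; the published Eqs. (7) and (9) may use different groupings):

```latex
\mathrm{Var}[\bar{Y}] = \frac{P_1 \cdot P_0}{N_{DCO}},
\qquad
\mathrm{SNR} = \frac{\max(P_1, P_0)^2}{\mathrm{Var}[\bar{Y}]}
             = \frac{\max(P_1, P_0)^2 \cdot N_{DCO}}{P_1 \cdot P_0}.
```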
3. Synchronous random sampling-and-averaging
Synchronous RSA means the number of samples per T during each RSA sampling process is constantly set by a certain value of oversampling ratio (OSR), though the uniformly distributed PDFs of all CKDCO sampling instants (or edges) are still I.I.D. Note that the OSR in this chapter is defined as the number of samples per CKτ period (T), which is not the same as the OSR usually defined in the sampling theorem for anti-aliasing. Multiple examples of synchronous RSA are listed as follows: (a) OSR = 1/2, one CKDCO edge uniformly samples two periods of CKτ (2·T). (b) OSR = 1, one CKDCO edge uniformly samples one period of CKτ (T). (c) OSR = 2, one CKDCO edge uniformly samples a half period of CKτ (0.5·T). (d) OSR = 4, each CKτ period is sampled by four independent CKDCO edges, and each sampling edge possesses a uniform PDF bounded within one of the T/4 regions. In sum, each CKDCO sampling edge occurs uniformly within a region of T/OSR. Under this assumption for synchronous RSA, although CKτ can be seamlessly and uniformly sampled by CKDCO regardless of the value of OSR, OSR plays the key role in determining the accuracy of synchronous RSA. In reality, generating these well-bounded random sampling edges in an IC implementation is challenging because the step size or resolution of the random sampling edges must be finer than what the target ENOB of each synchronous RSA measurement requires. These realization issues can be resolved by the asynchronous RSA elaborated in the next section. The assumption here is that the resolution of each CKDCO sampling PDF for synchronous RSA is sufficient to maintain an almost “continuous” probability density function [4] within its own distribution region set by T/OSR. Overall, though the realization of synchronous RSA is impractical, its concept can help readers understand the theories behind the asynchronous RSA and VR techniques described in the second half of this chapter.
Based on the definition of the synchronous RSA described above, the theoretical variances with respect to different OSRs can be derived. In the case of OSR = 1, the probability of obtaining a Logic-1 per sample is exactly τ/T = P1, so the expectation and theoretical variance of each RSA measurement are equal to the results shown in Eqs. (6) and (7), respectively. In the cases of OSR < 1 (i.e., subsampling), for example, when OSR = 1/4, the sampling region of each CKDCO edge is extended to 4·T, the probability of obtaining a Logic-1 stays the same as τ/T = (4·τ)/(4·T) = P1 since the positive duty-cycle of CKτ is also extended by 4 times. Overall, whenever OSR ≤ 1, the expectation and theoretical variance of synchronous RSA can be expressed by Eqs. (6) and (7), respectively, on the condition that OSR must be always the reciprocal of a positive integer number to satisfy a constant CKτ duty-cycle within each uniformly distributed sampling region set by T/OSR.
When OSR > 1, each uniformly distributed sampling region is labeled by an integer index k, k ∈ [1, OSR]; there are in total OSR sampling outcomes per T, and only one of these sampling outcomes can possibly be Logic-1 or Logic-0, namely the region containing the CKτ transition from high to low. For the example of OSR = 2 and τ/T < 0.5, the CKτ period is equally split into two sampling regions (k = 1 to 2). Only the first region (k = 1) can possibly obtain a Logic-1 or Logic-0 outcome because τ/T < 0.5, and the Logic-1 probability is τ/(T/2) = 2·P1; note that P1 = τ/T is the probability of obtaining Logic-1 in the cases of OSR ≤ 1. On the other hand, the outcome of the second region (k = 2) is always Logic-0 since CKτ does not transit in the region of k = 2. The equivalent outcome of each synchronous RSA measurement is the average of the k = 1 and k = 2 sampling outcomes. This example shows that the variance of each synchronous RSA measurement, Var[ȲOSR], decreases as OSR increases, because the randomness is confined to the single sub-region containing the CKτ transition.
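The variance benefit of oversampling can be sketched by comparing stratified synchronous sampling (OSR = 4, one uniform edge per T/4 sub-region) against OSR = 1; the parameter values below are illustrative assumptions:

```python
import random

random.seed(1)

T, tau = 1.0, 0.3      # assumed example values (tau/T = 0.3)
N_EXP = 20_000         # repeated RSA experiments per OSR setting

def rsa_outcome(osr):
    """One synchronous-RSA result: the average of `osr` stratified 1-bit
    samples, one uniform sampling edge per T/osr sub-region."""
    total = 0
    for k in range(osr):
        t = random.uniform(k * T / osr, (k + 1) * T / osr)
        total += 1 if t < tau else 0
    return total / osr

def empirical_var(osr):
    vals = [rsa_outcome(osr) for _ in range(N_EXP)]
    m = sum(vals) / N_EXP
    return sum((v - m) ** 2 for v in vals) / N_EXP

# Stratified oversampling (OSR = 4) yields a lower variance than OSR = 1,
# because only the sub-region containing the CK_tau transition is random.
assert empirical_var(4) < empirical_var(1)
```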
In the cases of OSR ≥ 1, the theoretical variance of the synchronous RSA can be acquired by finding the total covariance sum with the joint PDFs of all samples within each T, so the expectation and theoretical variance of the synchronous RSA measurement can be generalized as follows [6]:
Based on Eq. (11), the correlations induced by the oversampling process reduce the overall variances. That is, evenly splitting each T by OSR and taking the average of all sub-region outcomes to reach the overall result of each synchronous RSA measurement are actually producing negative correlations among all the samples. Note that “k” in Eq. (10) simply represents the index of the summation operator to find YOSR,n per T, but “k” in Eq. (11) is a specific integer number within 1 to OSR based upon the transition of CKτ; i.e., for a certain τ/T under the RSA measurement, only a certain k can meet the 1st line of Eq. (11). Also, note that, regardless of the value of OSR, the WLLN is always valid.
4. Asynchronous random sampling-and-averaging
The asynchronous RSA technique can be practically realized by power- and area-efficient ICs and performs I.I.D. random sampling within the period of CKτ without strict sampling-PDF boundaries or frequency relationships between CKτ and CKDCO. Before explaining how I.I.D. random sampling can be achieved in an asynchronous manner for practical RSA realization, some parameters are defined as follows: (a) TDCO,n is the n-th period of the DCO. (b) tSAMP,n is the absolute time of the n-th sampling edge of the DCO. (c) ΔTPRBS,n is the n-th DCO-period extension dynamically produced by the PRBS generator (TDCO,n = TDCO,MIN + ΔTPRBS,n), so TDCO,n ranges between (TDCO,MIN + 0) and (TDCO,MIN + ΔTPRBS,MAX) [7]. Both TDCO,MIN and ΔTPRBS,MAX can be coarsely adjusted by their own static controls in the DCO circuit. (d) For the sake of simplicity, CKτ and CKDCO are assumed to exhibit coincident rising edges at t = 0.
Though the DCO naturally performs phase-noise accumulation in the presence of device Johnson (thermal) noise, flicker noise, and power-supply noise, a strong phase-noise accumulation is required to form a widely distributed random sampling PDF. Therefore, a noise-energy-dominated PRBS noise source is intentionally added to the DCO for the asynchronous RSA technique. Theoretically, the combinational effect of multiple I.I.D. noise sources is equally applicable to the analysis result in this section, but the asynchronous RSA technique relies on the artificial PRBS noise to effectively scramble and avoid the synchronicity between CKτ and CKDCO. Meanwhile, although the circuit/device noise sources exhibit much lower energy than the PRBS noise does, they offer arbitrarily small phase-noise accumulations to fill the gaps among the discrete PRBS noise step sizes. Therefore, the combinational phase-noise accumulations from the circuits/devices and the PRBS generator help CKDCO possess an almost continuous sampling PDF.
Under the parameters and noise definitions described above, the n-th absolute sampling time can be represented by Eq. (12) as follows [6]:
Each tSAMP,n consists of two components: the deterministic term (n·TDCO,MIN) and the stochastic term created by the phase-noise accumulation, which represents the randomness of each CKDCO sampling instant; the occurrence of the n-th CKDCO sampling edge is therefore a random variable having its own PDF, fDCO,n(t), with a density magnitude across its distribution span. Also, as shown in Eq. (12), the stochastic term is the accumulation of “n” I.I.D. random variables (ΔTPRBS,k, k = 1 to n) produced by the PRBS generator over “n” times, so fDCO,n(t) is the convolution result of “n” uniformly distributed PDFs from the PRBS generator. For example, if n = 1, the stochastic term has one random variable from the PRBS generator with the PDF, fDCO,1(t), which is the fundamental uniform distribution with a distribution span over ΔTPRBS,MAX and a constant density magnitude of 1/ΔTPRBS,MAX. If n = 2, the stochastic term is the accumulation of two random variables, which are I.I.D. but sequentially produced by the PRBS generator. Therefore, fDCO,2(t) is the convolution of the two individual uniformly distributed PDFs because of the Convolution Theorem [4], and it exhibits an isosceles triangular distribution with a distribution span across 2·ΔTPRBS,MAX and a 1/ΔTPRBS,MAX peak density magnitude. By increasing the number of samples, the Central Limit Theorem guarantees that the PDF of the stochastic term converges to a Gaussian distribution regardless of the PDF of each random variable produced by the PRBS generator. That is, when n ≫ 1, the mean, standard deviation, span, and peak of fDCO,n(t) converge to the expressions as follows [6]:
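For n i.i.d. uniform steps over [0, ΔT_PRBS,MAX] offset by T_DCO,MIN each, the standard mean/variance results for a sum of uniforms give the following converged parameters (a sketch consistent with the description of Eq. (13); the published expression may be grouped differently):

```latex
\mathrm{Mean}_n = n\left(T_{DCO,MIN} + \frac{\Delta T_{PRBS,MAX}}{2}\right),
\qquad
\sigma_n = \sqrt{\frac{n}{12}}\,\Delta T_{PRBS,MAX},
\qquad
\mathrm{Span}_n = n\,\Delta T_{PRBS,MAX},
\qquad
\mathrm{Peak}_n \approx \frac{1}{\sigma_n \sqrt{2\pi}}.
```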
Note that the time-domain variable “t” represents absolute time values referenced to t = 0. Since the deterministic term (n·TDCO,MIN) of tSAMP,n sets the left distribution bound of fDCO,n(t), the PDFs of adjacent samples quickly exhibit a large amount of distribution overlaps along with the growth of Spann. Note that the overlaps among sampling PDFs in the time-domain do not change the monotonic ascending order of the tSAMP,n occurrences; this just indicates the correlation among the fDCO,n(t) due to the DCO phase-noise accumulation started from the 1st sample to the n-th sample as expressed in Eq. (12); e.g., tSAMP,n-1 always occurs before tSAMP,n although the PDFs of their stochastic term, fDCO,n-1(t) and fDCO,n(t), overlap on the top of each other in the absolute time domain.
The DCO sampling PDFs generated by the approach described so far do not seem uniform and independent because: (a) when n ≫ 1, fDCO,n(t) in Eq. (13) becomes a Gaussian-distribution PDF; (b) all the sampling instants, tSAMP,n, are correlated because of the DCO phase-noise accumulation. These two concerns can actually be resolved by properly setting up the noise energy (ΔTPRBS,MAX) of the PRBS generator together with the modulo-T operation enabled by the periodicity of CKτ, so the Gaussian-distributed and correlated characteristics of the DCO sampling PDF, fDCO,n(t), can be turned into the uniformly distributed and independent characteristics of the modulo-T sampling PDFs, fn(t). The modulo-T operation is automatically achieved when converting Δt into the duty-cycle τ/T of the periodic signal CKτ. The details of the modulo-T operation are described below.
If n ≫ 1, the Gaussian distribution span of fDCO,n(t) covers several CKτ periods, and the CKτ period T equivalently slices the entire Gaussian PDF into several pieces in the absolute time-domain. That is, these sliced PDFs maintain their own magnitude distributions but are all bounded within the same distribution span T, and all sample the identical CKτ waveform yn(t), t ∈ [0, T). In other words, the distributions of all sliced PDFs are strictly confined within a modulo-T time-interval between 0 and T. Thus, the net density-magnitude at any time instant within [0, T) is the linear summation of all sliced PDFs; i.e., the equivalent PDF of the n-th DCO sampling instant fn(t) is the summation of all sliced PDFs from fDCO,n(t) [6]:
where the time-domain variable “t” is confined within [0, T); (2·S + 1) is the number of segments set by the Spann; Meann’ and k·T are used to shift these (2·S + 1) segments to the modulo-T time-interval, [0, T). When n ≫ 1, fn(t) converges to a constant 1/T across the single T span; this fact can be proven by both mathematical calculation of Eq. (14) and statistical simulations. More importantly, by increasing “n”, all sampling PDFs converge to a uniformly distributed PDF with a constant density magnitude 1/T across the [0, T) distribution span, independent from the parameters of TDCO,MIN, ΔTPRBS,MAX, and even “n” when n ≫ 1. In other words, for all “n” ≫ 1, fn(t) becomes an “identically distributed” PDF, which satisfies the “second” criterion of the I.I.D. random variable and can be implemented by low-cost circuitry. This convergence of the uniform distribution also guarantees the convergence of the asynchronous RSA measurement result, which exactly matches to the expectation in Eq. (6), as follows [6]:
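The modulo-T folding toward a uniform PDF can be sketched by accumulating uniform PRBS-like steps and folding the sampling instants into [0, T); the step size, offset, and accumulation depth below are illustrative assumptions:

```python
import random

random.seed(2)

T = 1.0
dT_max = 0.25 * T          # PRBS noise energy, Delta_T_PRBS,MAX (assumed)
t_min = 0.6 * T            # T_DCO,MIN (assumed; irrelevant after folding)
n = 64                     # accumulation depth: Span_n = 16*T >> T
N_EXP = 100_000            # independent experiments for the n-th instant
BINS = 10

hist = [0] * BINS
for _ in range(N_EXP):
    # n-th sampling instant: deterministic offsets plus accumulated uniform noise
    t = sum(t_min + random.uniform(0.0, dT_max) for _ in range(n))
    hist[int((t % T) / T * BINS)] += 1   # modulo-T folding into [0, T)

# After folding, every bin should hold close to N_EXP / BINS samples,
# i.e., f_n(t) is nearly the uniform density 1/T across [0, T).
expected = N_EXP / BINS
assert all(abs(h - expected) < 0.05 * expected for h in hist)
```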
If the value of “n” is not large enough, the modulo-T PDF may behave like a non-uniform distribution; e.g., when ΔTPRBS,MAX = 0.25·T, fn(t) does not perform like a uniformly distributed PDF until n = 64. However, these non-uniform modulo-T PDFs will not affect the overall asynchronous RSA measurement result shown in Eq. (15), since such non-uniform modulo-T PDFs can occur only when all the RSA experiments share the identical absolute time reference at t = 0, as assumed for Eq. (12). In a realistic hardware implementation, the initial conditions of CKτ and CKDCO are inevitably randomized during the sampling process, so fn(t) for all “n” actually forms uniformly distributed PDFs within [0, T) even when “n” is a small number. This effect can be observed by merging a naturally randomized initial condition, tINT, into the unit impulse, δ(t), expressed in Eq. (16).
Based on the Convolution Theorem [4], fn(t) can be expressed in the format similar to fDCO,n(t) in Eq. (13), but note that fn(t) shall follow the Circular Convolution Theorem [8] due to the combination of the modulo-T operation and linear convolution [6]:
In Eq. (16), f1(t), fn-1(t), and fn(t) are in the modulo-T time-domain, t ∈ [0, T); fDCO,1(t) and δ(t) are in the absolute time-domain, t ∈ [0, ∞). It is important to emphasize that f1(t) is not only the PDF of the 1st sampling instant but also the elementary PDF used to derive fn(t) from fn-1(t) based on the Circular Convolution Theorem.
Several important attributes of f1(t) are summarized as follows: The distribution of f1(t) constantly starts at the remainder of TDCO,MIN/T. Whenever the PDF reaches t = T, it circulates back to t = 0 and then continues its distribution toward t = T. In the first case of ΔTPRBS,MAX < T, for example, ΔTPRBS,MAX = 0.25·T, f1(t) intuitively has non-zero values from a certain “t” to “t + 0.25·T” within [0, T). In the second case of ΔTPRBS,MAX = T or Mod[ΔTPRBS,MAX, T] = 0, f1(t) circulates an integer number of cycles uniformly from a certain “t” within [0, T) back to the same “t”. In the third case of ΔTPRBS,MAX > T and Mod[ΔTPRBS,MAX, T] ≠ 0, f1(t) exhibits two non-zero density magnitudes because fDCO,1(t) circulates within [0, T) multiple times with a non-zero remainder, and the delta between these two density magnitudes is 1/ΔTPRBS,MAX. The third case indicates that f1(t) can converge to a uniform distribution by itself when ΔTPRBS,MAX ≫ T. The attributes of f1(t) may seem insignificant since fn(t) converges to a uniform distribution anyway; however, although all fn(t) are identical when n ≫ 1, this elementary PDF, f1(t), plays an important role from the standpoint of the correlations among all sampling PDFs, fn(t), n = 1 to NDCO.
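The two density magnitudes of the third case can be checked numerically; the specific ΔT_PRBS,MAX below is an illustrative assumption:

```python
# Hedged numeric check of the third f1(t) case above:
# Delta_T_PRBS,MAX > T with a non-zero remainder after modulo-T wrapping.
T = 1.0
dT_max = 2.4 * T                 # Mod[dT_max, T] = 0.4*T != 0 (assumed value)
full_wraps = int(dT_max // T)    # complete circulations within [0, T)
remainder = dT_max - full_wraps * T

# f1(t) density magnitudes on the two modulo-T sub-regions:
low = full_wraps / dT_max        # region covered `full_wraps` times
high = (full_wraps + 1) / dT_max # region covered one extra (partial) time

# The delta between the two magnitudes is 1/Delta_T_PRBS,MAX, as stated above.
assert abs((high - low) - 1.0 / dT_max) < 1e-12
# The wrapped density still integrates to 1 over [0, T).
assert abs(high * remainder + low * (T - remainder) - 1.0) < 1e-12
```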
To elaborate the correlations among all sampling PDFs, the covariance between adjacent samples, Yn and Yn+1, can first be examined as follows [6]:
where the 1st line of Eq. (17) is based on the fundamental covariance definition of two random variables, Yn and Yn+1, on the same sample space, R, with their joint PDF, f(yn, yn+1), and PDF variables, yn and yn+1. The 2nd line of Eq. (17) represents the covariance in the format of one-dimensional geometric probability with the modulo-T time-domain PDF variables, yn(t) and yn+1(t), and joint PDF, fn,n+1(t). yn(t) and yn+1(t) are the same because of the CKτ periodicity, and the possible outcomes of Yn and Yn+1 are either Logic-1 or Logic-0 both with the expectation E[Y] = τ/T = P1 verified in Eq. (15). By taking advantage of simple binary values, Cov[Yn, Yn+1] can be expanded into the summation of four conditional covariances based on the total four possible combinations of Yn and Yn+1 with their corresponding conditional joint PDFs as shown in the 3rd to 6th lines of Eq. (17). In general, fn(t) is a uniform distribution for all ΔTPRBS,MAX scenarios when n ≫ 1, so fn(t)|Yn = 1 and fn(t)|Yn = 0 (the conditional PDFs) are set by yn(t) across the modulo-T time interval. To further reach the conditional joint PDFs, fn,n+1(t)|Yn, each fn(t)|Yn has to circularly convolute with the fundamental PDF element, f1(t), which is a function of ΔTPRBS,MAX, so different f1(t) generate their corresponding fn,n+1(t)|Yn = 1 and fn,n+1(t)|Yn = 0 as follows [6]:
Note that fn(t) in Eq. (16) and fn,n+1(t)|Yn in Eq. (18) are different since fn(t) is calculated from fn-1(t) and becomes independent from f1(t), but fn,n+1(t)|Yn is obtained from fn(t)|Yn, and f1(t) determines their correlation. Eventually, the conditional joint PDFs of each scenario are derived from yn+1(t) across t ∈ [0, T) to thoroughly include all possible conditions of Yn and Yn+1.
When Mod[ΔTPRBS,MAX, T] = 0, the four conditional joint PDFs all maintain constant density magnitudes within their own integral time-intervals, [0, τ) and [τ, T), so the covariance of the adjacent samples can be further derived for this scenario easily [6]:
Based on the result of Eq. (19), i.e., a zero covariance between any adjacent samples Yn and Yn+1, and the accumulated relation from f1(t) to fn(t) shown in Eq. (16), Mod[ΔTPRBS,MAX, T] = 0 is the necessary condition for all, not just adjacent, fn(t) to be “independent.” By consolidating the identical-distribution and independence properties of fn(t) for all ΔTPRBS,MAX scenarios, Mod[ΔTPRBS,MAX, T] = 0 is the fundamental criterion to perform an asynchronous RSA-based TDC with I.I.D. random sampling PDFs. This result matches the requirement of a synchronous RSA-based TDC when OSR = 1 and the CKDCO sampling PDF is uniformly distributed across one CKτ cycle, T. The main difference between the two is that the asynchronous RSA-based TDC requires more time per sample due to the deterministic offset, TDCO,MIN. However, this deterministic offset per sample offers a margin for practical circuit implementation in the asynchronous RSA-based TDC.
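The zero-covariance condition can be sketched by simulating adjacent 1-bit samples with Mod[ΔT_PRBS,MAX, T] = 0 (here ΔT_PRBS,MAX = T); the remaining parameters are illustrative assumptions:

```python
import random

random.seed(3)

T, tau = 1.0, 0.3            # assumed duty-cycle tau/T = 0.3
t_min = 1.7 * T              # T_DCO,MIN (assumed; any value works here)
dT_max = T                   # Mod[Delta_T_PRBS,MAX, T] = 0 -> I.I.D. sampling
N_EXP = 50_000

pairs = []
for _ in range(N_EXP):
    t = random.uniform(0.0, T)               # randomized initial phase
    y = []
    for _ in range(2):                       # two adjacent DCO samples
        t += t_min + random.uniform(0.0, dT_max)
        y.append(1 if (t % T) < tau else 0)  # 1-bit DFF outcome
    pairs.append(y)

mean0 = sum(p[0] for p in pairs) / N_EXP
mean1 = sum(p[1] for p in pairs) / N_EXP
cov = sum((p[0] - mean0) * (p[1] - mean1) for p in pairs) / N_EXP
assert abs(cov) < 0.01       # covariance of adjacent samples vanishes
```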
Accurately performing “Mod[ΔTPRBS,MAX, T] = 0” may increase the cost of asynchronous RSA-based TDC implementation. To resolve this issue, the case of ΔTPRBS,MAX > T could be considered, though the result of Eq. (19) indicates that the non-uniform conditional joint PDFs induce non-zero covariances. These non-uniform conditional joint PDFs are mainly caused by the non-uniform f1(t); however, the distribution of f1(t) can be flattened by setting ΔTPRBS,MAX ≫ T so that all the conditional joint PDFs become roughly uniform. In other words, ΔTPRBS,MAX ≫ T can be easily implemented without an exact relationship between ΔTPRBS,MAX and T [7, 9], but the downside is an even slower conversion-rate since TDCO,MIN is inevitably enlarged along with ΔTPRBS,MAX when ΔTPRBS,MAX ≫ T. The compromise between hardware cost and conversion-rate further confirms the necessity of the variance-reduction (VR) techniques discussed in the following sections.
5. Self-antithetic variance reduction
The analysis of self-antithetic variance reduction (SAVR) technique in an asynchronous RSA system can be started from formulating the variance of a Monte Carlo estimate in general [10, 11]:
In Eq. (20), “n” and “k” are the sample indexes; Cov[Yn, Yk] represents the pairwise covariance of any two samples Yn and Yk. If n = k, then Cov[Yn, Yk] = Var[Yn], which is the variance of the n-th sample. If n ≠ k, Cov[Yn, Yk] = Cov[Yk, Yn], which forms a symmetric covariance matrix. When Y1, Y2, …, and YNDCO are I.I.D., all the off-diagonal covariances vanish, and Eq. (20) reduces to the I.I.D. variance result of Eq. (7).
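Eq. (20) can be sketched from the standard variance-of-a-mean expansion, consistent with the pairwise-covariance description above (the published grouping may differ):

```latex
\mathrm{Var}[\bar{Y}]
= \frac{1}{N_{DCO}^{2}} \sum_{n=1}^{N_{DCO}} \sum_{k=1}^{N_{DCO}} \mathrm{Cov}[Y_n, Y_k]
= \frac{\mathrm{Var}[Y]}{N_{DCO}}
+ \frac{2}{N_{DCO}^{2}} \sum_{n=1}^{N_{DCO}-1} \sum_{k=n+1}^{N_{DCO}} \mathrm{Cov}[Y_n, Y_k].
```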
According to the analysis of the covariances of adjacent samples, Cov[Yn, Yn+1], under the three ΔTPRBS,MAX scenarios (i.e., ΔTPRBS,MAX <, =, or > T) in Eq. (17), the conclusion shows that Cov[Yn, Yn+1] can have pronounced non-zero values when Mod[ΔTPRBS,MAX, T] ≠ 0 due to the non-uniform conditional joint PDFs within their own integral time-intervals, [0, τ) and [τ, T). In this section, the case of non-zero covariances between adjacent samples is further extended to the deduction of any pairwise correlation between Yn and Yk due to the DCO phase-noise accumulation property reflected by Eqs. (12) and (16). Thus, if the technique can take advantage of these non-zero covariances, Cov[Yn, Yk], and assure that the pairwise covariance sum is negative, then VR can be successfully performed.
To find Cov[Yn, Yk], all conditional joint PDFs of any [Yn, Yk] pair have to be first formulated, where n = 1 to (NDCO − 1) and k = (n+1) to NDCO are sufficient to cover all covariance elements in the symmetric covariance matrix. Though there are only four possible binary combinational outcomes of a certain [Yn, Yk] pair, its conditional joint PDFs shall also cover all possible binary combinational outcomes from Yn+1 to Yk−1 because the accumulation property and Convolution Theorem described in Eq. (16) form the chain of correlations from each specific Yn through Yn+1, Yn+2, …, Yk−2, and Yk−1 all the way to each specific Yk. Thus, the conditional joint PDFs of a [Yn, Yk] pair under all possible conditions of [Yn, …, Yk−1] shall be generalized as follows [10]:
where q = 0, 1, 2, …, (2^(k−n) − 1); all possible binary combinational conditions of [Yn, …, Yk−1] are represented by the corresponding decimal numbers, “q”, and decimal-to-binary operators, D2B[·], for the sake of simplicity. Since the location of “τ” within the modulo-T time-interval, [0, T), defines the PDF boundaries for the probability of Yk to be Logic-1 or Logic-0, the conditional joint PDF of a [Yn, Yk] pair under all possible conditions of [Yn, …, Yk] can be obtained by forcing one part, i.e., [0, τ) or [τ, T), of the PDF in Eq. (21) to zero to satisfy one of the possible Yk conditions [10]:
In other words, each conditional joint PDF under the conditions of [Yn, …, Yk−1] in Eq. (21) can diversify into two conditional joint PDFs under the conditions of [Yn, …, Yk] as shown in Eqs. (22) and (23), respectively. Note that the distribution profiles of all conditional joint PDFs are convoluted functions of τ and f1(t), and f1(t) itself is a function of Mod[TDCO,MIN, T] and ΔTPRBS,MAX. Therefore, the joint PDFs can be very different if the values of τ, Mod[TDCO,MIN, T], and ΔTPRBS,MAX are set differently; i.e., these parameters dominantly affect the behaviors of the pairwise covariances, Cov[Yn, Yk], which now can be generalized as follows based on the information of all conditional joint PDF of a [Yn, Yk] pair from Eqs. (22) and (23) [10]:
As shown in Eq. (24), Cov[Yn, Yk] can be grouped into four terms based on the binary combinational outcomes of a [Yn, Yk] pair, and each term is the summation of 2^(k−n−1) integrals of the conditional joint PDFs under all possible conditions of [Yn+1, …, Yk−1]. Thus, the total number of summation terms or conditional joint PDFs in Eq. (24) is 2²·2^(k−n−1) = 2^(k−n+1) for each Cov[Yn, Yk]. In addition, each conditional joint PDF, as noted above, is a convoluted function of τ and f1(t).
To simplify the analysis process, the mechanism of SAVR is elaborated through specific examples in an asynchronous RSA system, from which the general cases can be further summarized. The following two examples have common parameter setups: the time-domain quantity under the RSA measurements, τ/T = 0.5 (= P1), NDCO = 256, and Mod[TDCO,MIN, T] ≈ T/2. The only difference is that their ΔTPRBS,MAX are set to T/8 and T/16 individually through the static PRBS energy-level control. The correlation function between Yn and Yk can be observed by plotting Cov[Yn, Yk] of each “n” across all possible “k” (= 1 to NDCO) with NEXP = 2¹³. For reference, note that the correlation functions of the I.I.D. scenarios (i.e., ΔTPRBS,MAX = T) verified in Eq. (19) are basically zero for all n ≠ k and have the non-zero value, 0.25 (= P1·P0), at n = k. From these examples of ΔTPRBS,MAX < T, multiple attributes can be observed: (a) Cov[Yn, Yk] of each “n” is self-symmetrical with respect to “k = n”. (b) Because of the symmetric covariance matrix and Cov[Yn, Yk] = Var[Yn] = P1·P0 when k = n, all correlation functions have the same profile but shift along the k-axis based on the value of “n”, and therefore Cov[Yn, Yk] = Cov[Yn+m, Yk+m] for any valid index shift, “m”.
The effect of this SAVR technique can be more comprehensively demonstrated by the single-dimensional covariance sums and the overall (two-dimensional) covariance sum, i.e., the variance of the averaged RSA outcome in Eq. (20).
The analysis results so far for SAVR offer further insights: (a) A smaller ΔTPRBS,MAX, i.e., a narrower fundamental sampling PDF, f1(t), creates stronger correlations across all sampling points (i.e., longer correlation tails), longer LCMPs, and eventually more variance reduction. (b) Though the sign of each single-dimensional covariance sum can be either positive or negative with VR, the average of all single-dimensional covariance sums is always above zero, which verifies the principles of non-negative variances and finite measurement resolutions. (c) More importantly, the computation efficiency of each theoretical single-dimensional covariance sum and overall variance can be greatly improved by only summing the covariances surrounding the “k = n” within a few LCMPs [10]:
where “r” is the number of LCMPs included in the approximation of Eq. (25). Although the accurate single-dimensional covariance sums and the partial single-dimensional covariance sums formulated by the right-hand side of Eq. (25) with r = 1 and k = (n − LCMP/2) to (n + LCMP/2 − 1) at each “n” differ by certain amounts, the distribution of the partial single-dimensional covariance sums is relatively constant and essentially equal to the average of the accurate single-dimensional covariance sums, except when “n” approaches 1 or NDCO, where the correlation functions have significantly unbalanced correlation-tail lengths. Therefore, the partial single-dimensional covariance sum at any “n” far away from 1 or NDCO is sufficient for approximating the average of the accurate single-dimensional covariance sums formulated by the left-hand side of Eq. (25). For instance, by plugging Eq. (25) into the end of the 1st line of Eq. (20) with “n = NDCO/2”, the overall variance can be approximated as follows [10]:
Note that the Cov[Yn, Yk] in Eqs. (25) and (26) is referring to its theoretical definition in Eq. (24), and the approximation errors in Eqs. (25) and (26) can be improved by extending the number of LCMP, i.e., “r”, included in the summation operators.
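The partial-sum approximation above can be sanity-checked numerically. The sketch reuses the simplified jitter model from before; the wider ΔTPRBS,MAX = T/2 and the half-window of 8 samples are assumed values chosen so that the correlation tails fit inside the window:

```python
import numpy as np

rng = np.random.default_rng(1)
T, tau, T_dco = 1.0, 0.5, 1.5   # assumed illustrative parameters
dt_max = T / 2                  # wider jitter range -> shorter correlation tails
N_DCO, N_EXP = 64, 2**14

jitter = rng.uniform(0.0, dt_max, size=(N_EXP, N_DCO))
Y = ((np.cumsum(T_dco + jitter, axis=1) % T) < tau).astype(float)
C = np.cov(Y, rowvar=False)

# full two-dimensional covariance sum (Eq. (20) analogue)
var_full = C.sum() / N_DCO**2
# partial single-dimensional sum around k = n at the central row (Eq. (26) analogue)
n, w = N_DCO // 2, 8
var_partial = C[n, n - w:n + w].sum() / N_DCO
```

The partial sum touches only one row of the covariance matrix yet lands close to the full double sum, which is the computation-effort saving the text describes.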
As discussed earlier, if the theoretical computation effort is dominated by the modulo-T circular convolutions, the computation efficiency improvement from Eq. (20) to Eq. (26) can be evaluated by the ratio between their numbers of modulo-T circular convolution operations. Since the correlation tails roughly vanish when |k − n| > NDCO/2, the required number of modulo-T circular convolutions in Eq. (20) can be expressed as the numerator of Eq. (27). On the other hand, for Eq. (26), only the covariance elements within one LCMP need to be calculated, and the required number of modulo-T circular convolutions in Eq. (26) can be expressed as the denominator of Eq. (27) [10].
The ratio of modulo-T circular convolution numbers between Eqs. (20) and (26) under this specific example (i.e., Mod[TDCO,MIN, T] ≈ T/2 and ΔTPRBS,MAX ≈ T/8) is about 2.18 × 10^34. This incredible “computation effort reduction” (NOT variance reduction), even just for NDCO = 2^8 and one LCMP included, arises mainly because all pairwise covariances theoretically have to be included in Eq. (20) due to the inevitable correlations (based on the DCO phase-noise accumulation property and the Convolution Theorem) among the majority of the NDCO samples, whereas only the average of all the covariances actually matters and can be approximated by summing the theoretical covariances within a few LCMPs of the central correlation function, as shown in Eq. (26). In sum, the theoretical calculation results and the statistical asynchronous RSA simulation results match well, but the computation efficiency of the theoretical variance with SAVR enabled is quite low if it relies on Eqs. (20)–(24). On the other hand, Eq. (26) brings the computation efficiency to a reasonable level but loses some accuracy. From the analytical point of view, only the Monte Carlo approach based on statistical software/lab experiments can offer efficiency and accuracy simultaneously [10].
6. Control-variate variance reduction
The control-variate variance reduction (CVVR) [3] employs the known errors in the estimates of pre-set reference quantities to reduce the variance in the estimate of a quantity under the asynchronous RSA measurement. Note that both SAVR and CVVR in this chapter reduce the power of quantization error by creating correlations among the samples of each RSA measurement. The difference is that the correlations of SAVR exist among the samples of a single random variable Y, i.e., [Y1, Y2, …, YNDCO], whereas the correlations of CVVR exist between the samples of two parallel random variables, Y and YREF.
The theoretical analysis of CVVR can start with the case where Y and YREF are individual I.I.D. random variables; i.e., no auto-correlation, and only cross-correlation is considered in this section. Eq. (28) shows the process of sampling CKτ and CKτREF by CKDCO simultaneously to generate a cross-correlation between Y and YREF only at the sampling instants of “n = k” because Y and YREF are individual I.I.D. random variables [11]:
In Eq. (28), both “n” and “k” = 1 to NDCO. Furthermore, the cross-covariance between the outcomes of the parallel asynchronous RSA processes, Cov[Yn, YREF,k], can be expressed as follows [11]:
Since Y and YREF are individual I.I.D. random variables, the cross-covariance can be simplified based on Eq. (28) as follows [11]:
As shown in Eq. (30), sampling CKτ and CKτREF by CKDCO simultaneously can create a cross-correlation between Yn and YREF,n, whose correlation coefficient can be expressed as follows [11]:
where the cross-covariance of Eq. (30) is normalized by the standard deviations of Yn and YREF,n.
According to Eq. (32), the correlation coefficients are functions of τ and τREF since P1 = 1 − P0 = τ/T and P1,REF = 1 − P0,REF = τREF/T according to Eq. (6); thus, the degree of the cross-correlation between Y and YREF is determined by the amount of overlap between the waveforms of CKτ and CKτREF.
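A quick numerical check of this overlap argument, under the simplifying assumptions that both clock waveforms are high at the start of each period (so the joint-high probability is min(τ, τREF)/T) and that the sampling instants are ideally uniform:

```python
import numpy as np

rng = np.random.default_rng(2)
T, tau, tau_ref = 1.0, 0.3, 0.5      # assumed duty-cycles of CK_tau and CK_tauREF
N = 2**16

t = rng.uniform(0.0, T, N)           # one shared set of random sampling instants (CKDCO)
Y = (t < tau).astype(float)          # assumes CK_tau is high during [0, tau)
Y_ref = (t < tau_ref).astype(float)  # assumes CK_tauREF is high during [0, tau_ref)

C2 = np.cov(Y, Y_ref)
cov_est = C2[0, 1]
# overlap-based prediction for aligned rising edges: P11 = min(tau, tau_ref)/T
cov_th = min(tau, tau_ref) / T - (tau / T) * (tau_ref / T)
rho = cov_est / np.sqrt(C2[0, 0] * C2[1, 1])   # empirical correlation coefficient
```

With these duty-cycles the predicted cross-covariance is 0.3 − 0.3·0.5 = 0.15, and the estimate converges to it as N grows.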
Once it is confirmed that the cross-correlation between Y and YREF can be implemented by the random sampling process, the Monte Carlo estimate of YREF is obtained by the regular asynchronous RSA process to measure the duty-cycle of CKτREF (τREF/T) with a relatively high accuracy, so that the expectation of YREF can be treated as a pre-measured known value.
With the information of the pre-measured expectation of YREF, the per-sample control-variate estimator can be constructed as follows [3, 11]:
where YCV,n is the variance-reduced version of Yn per sample; the error term, (YREF,n − E[YREF]), is scaled by the coefficient μCV and subtracted from Yn so that the correlated portion of the quantization error is canceled.
There are two major concerns in the IC implementation: the additional circuit power/area consumption and the achievable degree of correlation. Realizing YCV,n in Eq. (33) requires a high-speed/high-resolution digital operation per sample, which can consume a certain amount of power/area for two reasons: (a) the pre-measured expectation of YREF and the coefficient μCV are multi-bit fractional values, so the per-sample scaling and subtraction demand multi-bit arithmetic; and (b) these operations must run at the full sampling rate of the RSA process.
Compared to Eqs. (33)-(35), respectively, Eqs. (36)-(38) represent a power/area-efficient IC implementation of the RSA with CVVR technique: (a) Instead of finding every high-resolution YCV,n in Eq. (33) and then averaging NDCO samples, Eq. (36) shows that the binary sequences of Yn and YREF,n can be accumulated into their averages first, so the multi-bit scaling and subtraction are performed only once per RSA measurement rather than once per sample.
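The averaging-first control variate can be sketched as follows; the duty-cycles, the run counts, and the closed-form μCV below are illustrative assumptions for an I.I.D. sampling model with aligned clock edges, not the exact quantities of Eqs. (36)-(39):

```python
import numpy as np

rng = np.random.default_rng(3)
T, tau, tau_ref = 1.0, 0.3, 0.5        # assumed duty-cycles
N_DCO, N_EXP = 256, 4000               # samples per measurement / repeated runs

t = rng.uniform(0.0, T, (N_EXP, N_DCO))    # shared i.i.d. sampling instants
Y = (t < tau).astype(float)
Y_ref = (t < tau_ref).astype(float)
Y_bar, Y_ref_bar = Y.mean(axis=1), Y_ref.mean(axis=1)

E_ref = tau_ref / T                    # pre-measured expectation of Y_REF
# optimal coefficient mu_CV = Cov[Y, Y_REF] / Var[Y_REF] for this aligned-edge model
mu_cv = (min(tau, tau_ref)/T - (tau/T)*(tau_ref/T)) / ((tau_ref/T)*(1 - tau_ref/T))
# averaging-first control variate: one multi-bit scale/subtract per measurement
Y_cv = Y_bar - mu_cv * (Y_ref_bar - E_ref)

var_plain, var_cv = Y_bar.var(), Y_cv.var()   # var_cv ~ (1 - rho^2) * var_plain
```

The correction term has zero mean, so the estimator stays unbiased while its variance shrinks by the factor (1 − ρ²) of the cross-correlation.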
With Eq. (36) and the pre-measured expectation of YREF, the expectation and variance of the RSA with CVVR technique are given in Eqs. (37) and (38), and the optimal μCV that minimizes the variance in Eq. (38) can be derived as follows [11]:
By plugging Eq. (39) into (38), the minimum variance of the RSA with CVVR technique is shown as follows [11]:
Clearly, the amount of variance reduction (the 2nd term of Eq. (40)) from the variance of an I.I.D. case (the 1st term of Eq. (40)) is mainly determined by the cross-correlation between the outcomes of the parallel RSA processes.
Based on Eqs. (39)-(41), several attributes and implementation methodologies of CVVR are discussed as follows: (a) The gain of CVVR, GCV, is a function of the cross-correlation between the final outcomes of the parallel RSA processes; i.e., a stronger cross-correlation yields a larger GCV and thus more variance reduction.
The attributes mentioned above are all under the assumption of knowing the optimal value of μCV. However, based on Eqs. (30) and (39), computing the optimal μCV requires the ideal expectations of Y and YREF, which are unavailable in a practical measurement; instead, μCV can be estimated directly from the measured samples as follows [11]:
Though Eq. (42) eliminates the necessity of the ideal expectations, it requires hardware to store the entire data sequences of Yn and YREF,n with NDCO samples until the outcomes of the parallel RSA processes become available, which costs a considerable amount of memory area.
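One standard way to avoid storing the full sequences is to keep only running sums, from which a sample-based μCV (a textbook covariance-over-variance estimate, shown here as an assumed stand-in for the exact form of Eq. (42)) is computed once at the end of the measurement:

```python
import numpy as np

rng = np.random.default_rng(4)
T, tau, tau_ref, N_DCO = 1.0, 0.3, 0.5, 2**12   # assumed illustrative parameters
t = rng.uniform(0.0, T, N_DCO)                  # shared i.i.d. sampling instants
Y = (t < tau).astype(float)
Y_ref = (t < tau_ref).astype(float)

# single-pass accumulators: the full Y_n / Y_REF,n sequences need not be stored
s_y = s_r = s_yr = s_rr = 0.0
for y, r in zip(Y, Y_ref):
    s_y += y
    s_r += r
    s_yr += y * r
    s_rr += r * r

# sample-based estimate of mu_CV = Cov[Y, Y_REF] / Var[Y_REF]
mu_hat = (s_yr - s_y * s_r / N_DCO) / (s_rr - s_r * s_r / N_DCO)
```

Only four accumulators are carried through the measurement, which maps naturally onto a small set of hardware counters instead of an NDCO-deep memory.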
7. Conclusion
The evolution of the four RSA techniques has been reviewed in this chapter, including the synchronous RSA, the asynchronous RSA, the asynchronous RSA with SAVR, and the asynchronous RSA with CVVR, whose theoretical expectations, variances, figures of merit, and IC-implementation parameter settings are listed in Table 1. The main goal of this chapter is to summarize the concepts/algorithms of the RSA with VR techniques and to introduce the unified RSA-based TDC architecture for both high-speed and high-resolution TCSPC applications.
Technology | 22-nm CMOS | 22-nm CMOS | 22-nm CMOS |
---|---|---|---|
Technique | Asyn. RSA | Asyn. RSA w/SAVR | Asyn. RSA w/CVVR |
DCO Power (mW) | 3 | 3.1 | 3 |
Sampling Frequency (GS/s) | 4 | 7.8 | 4 |
# of Sampling Phases | 8 | 8 | 8 |
Dynamic Range (ns) | 10 ∼ 1000 | 10 ∼ 1000 | 10 ∼ 1000 |
ENOB | 12 @NDCO = 2^24; 14 @NDCO = 2^28 | 12 @NDCO = 2^19; 14 @NDCO = 2^23 | 12 @NDCO = 2^20.9; 14 @NDCO = 2^24.9 |
Effective Resolution (ps @14 ENOB) | 0.61 ∼ 61 | 0.61 ∼ 61 | 0.61 ∼ 61 |
Conversion-Rate (kHz @12 ENOB) | 2 | 120 | 16 |
TDC Power (mW) | 1.3 | 1.5 | 1.9 |
TDC FoM (pJ/step @12 ENOB) | 159 | 3.1 | 29 |
DCO + TDC Power Ratio | 1 | 1.07 | 1.14 |
Conversion-Rate Ratio | 1 | 60 | 8 |
TDC Area (mm²) | 0.01 | 0.01 | 0.018 |
Digital Filter Power (mW) | 0.45 | 0.90 | 0.91 |
Multi-bit Digital Operations per RSA-based TDC | Eq. (6) | Eq. (6) | Eqs. (36) and (43) |
Theoretical Expectation | Eq. (15) | Eq. (6) | Eq. (37) |
Theoretical Variance | Eq. (7) | Eq. (20) | Eq. (40) |
Circuit Parameters @T = 250 ps, TDCO,MIN = T/2 | ΔTPRBS,MAX = T | ΔTPRBS,MAX = T/32 | ΔTPRBS,MAX = T, τREF LSB = T/16 |
References
- 1. Becker W. Advanced Time-Correlated Single Photon Counting Techniques. Berlin, Germany: Springer; 2005
- 2. Becker W. The bh TCSPC Handbook. 7th ed. Berlin, Germany: Becker & Hickl GmbH; 2017
- 3. Glasserman P. Monte Carlo Methods in Financial Engineering. New York, NY, USA: Springer; 2003. DOI: 10.1007/978-0-387-21617-1
- 4. Ghahramani S. Fundamentals of Probability. Upper Saddle River, NJ, USA: Prentice-Hall; 1996
- 5. Haykin S. Communication Systems. 4th ed. New York, NY, USA: Wiley; 2001
- 6. Wu T, Yang R, Hsueh T-C. Random sampling-and-averaging techniques for single-photon arrival-time detections in quantum applications: Theoretical analysis and realization methodology. IEEE Transactions on Circuits and Systems I: Regular Papers. 2022;69(4):1452-1465. DOI: 10.1109/TCSI.2021.3135833
- 7. Hsueh T-C, O’Mahony F, Mansuri M, Casper B. An on-die all-digital power supply noise analyzer with enhanced spectrum measurements. IEEE Journal of Solid-State Circuits. 2015;50(7):1711-1721. DOI: 10.1109/JSSC.2015.2431071
- 8. Oppenheim AV, Schafer RW, Buck JR. Discrete-Time Signal Processing. 2nd ed. Upper Saddle River, NJ, USA: Prentice-Hall; 1999
- 9. Hsueh T-C, Balamurugan G, Jaussi J, Hyvonen S, Kennedy J, Keskin G, et al. A 25.6Gb/s differential and DDR4/GDDR5 dual-mode transmitter with digital clock calibration in 22nm CMOS. In: IEEE ISSCC Digest of Technical Papers. San Francisco, CA, USA: IEEE; 2014. pp. 444-445. DOI: 10.1109/ISSCC.2014.6757506
- 10. Wu T, Hsueh T-C. A high-resolution single-photon arrival-time measurement with self-antithetic variance reduction in quantum applications: Theoretical analysis and performance estimation. IEEE Transactions on Quantum Engineering. 2022;3:1-15. DOI: 10.1109/TQE.2022.3209211
- 11. Yang R, Wu T, Hsueh T-C. A high-accuracy single-photon time-interval measurement in mega-Hz detection rates with collaborative variance reduction: Theoretical analysis and realization methodology. IEEE Transactions on Circuits and Systems I: Regular Papers. 2023;70(1):176-189. DOI: 10.1109/TCSI.2022.3206406