## 1. Introduction

Optical detection involves converting an optical signal into an electrical signal. Most optical detectors are operated in a linear mode, i.e. the output signal is proportional to the incident photon flux. The main limitation in the sensitivity of these linear detectors is the ability to extract a small signal from the amplifier noise in a given bandwidth. For example, avalanche photodiodes (APDs) commonly used in fiber-based optical communications have sensitivities of a few hundred photons in a 100 ps detection window. When higher sensitivities are needed, single-photon detectors (SPDs) are often used, which operate in a strongly nonlinear mode. Indeed, a pulse containing more than one photon produces the same output signal as a single-photon pulse, which implies that it is not possible to directly measure the number *n* of photons in a pulse if the pulse duration is shorter than the detector response time. However, photon number resolving (PNR) detectors are important in quantum communication, quantum information processing and quantum optics for two classes of applications. In the first case, PNR detectors are needed to reconstruct the incoming photon number statistics by ensemble measurements. This is the case for the characterization of nonclassical light sources such as single-photon (Yuan et al. 2002) or *n*-photon (Waks et al. 2004) state generators, or for the detection of pulse splitting attacks in quantum cryptography (Brassard et al. 2000). In the second case, PNR detectors are needed to perform a single-shot measurement of the photon number. Applications of this kind are linear-optics quantum computing (Knill et al. 2001), long-distance quantum communication (which requires quantum repeaters (Sangouard et al. 2007)) and conditional-state preparation (Sliwa & Banaszek 2003). Moreover, a linear detector with single-photon sensitivity can also be used for measuring a temporal waveform at extremely low light levels, e.g. in long-distance optical communications, fluorescence spectroscopy, and optical time-domain reflectometry.

Among the approaches proposed so far for PNR detection, detectors based on charge integration or field-effect transistors (Fujiwara & Sasaki 2007; Gansen et al. 2007; Kardynal et al. 2007) are affected by long integration times, leading to bandwidths <1 MHz. Transition edge sensors (TESs; Lita et al. 2008) show extremely high (95%) detection efficiencies, but they operate at 100 mK and show long response times (several hundreds of nanoseconds in the best case). Approaches based on photomultipliers (PMTs) (Zambra et al. 2004) and APDs, such as the visible light photon counter (VLPC) (Waks et al. 2003; Waks et al. 2004), 2D arrays of APDs (Yamamoto et al. 2006; Jiang et al. 2007) and time-multiplexed detectors (Achilles et al. 2003; Fitch et al. 2003), are not sensitive or are plagued by high dark count rates and long dead times in the telecommunication spectral windows. Arrays of SPDs additionally involve complex read-out schemes (Jiang et al. 2007) or separate contacts, amplification and discrimination (Dauler et al. 2007). We recently demonstrated an alternative approach (Divochiy et al. 2008; Marsili et al. 2009a), the Parallel Nanowire Detector (PND), which uses spatial multiplexing on a subwavelength scale to provide a single electrical output proportional to the photon number. The device presented significantly outperforms existing PNR detectors in terms of simplicity, sensitivity, speed, and multiplication noise (Divochiy et al. 2008). Here we present the working principle of the device (section 2), its fabrication process (section 3), the results of the optical characterization (section 4), an analysis of the device operation and corresponding design guidelines (section 5) and the first application of a PND to reconstruct unknown incoming photon number statistics (section 6).

## 2. Photon number resolution principle

The structure of PNDs is the parallel connection of *N* superconducting nanowires (*N*-PND), each of which can be connected in series to a resistor *R*_{0} (*N*-PND-R, Figure 1a). The detecting element is a 4-6 nm thick, 100 nm wide NbN wire folded in a meander pattern. Each section acts as a nanowire superconducting single photon detector (SSPD) (Verevkin et al. 2002). In SSPDs, if a superconducting nanowire is biased close to its critical current, the absorption of a photon causes the formation of a normal barrier across its cross section, so almost all the bias current is pushed to the external circuit. In PNDs, the currents from different sections can sum up on the external load, producing an output voltage pulse proportional to the number of photons absorbed.

The time evolution of the device after photon absorption can be simulated using the equivalent circuit of Figure 1b. Each section is modeled as the series connection of a switch which opens on the hotspot resistance *R*_{hs} for a time t_{hs}, simulating the absorption of a photon, of an inductance L_{kin}, accounting for kinetic inductance (Kadin) and of a resistor *R*_{0}. The device is connected through a bias T to the bias current source *I* and to the input resistance of the preamplifier *R*_{out}. The *n* firing sections, in red, all carry the same current *I*_{f} and the *N*-*n* still superconducting sections (unfiring), in blue, all carry the same current *I*_{u}. *I*_{out} is the current flowing through *R*_{out}. Let *I*_{B} be the bias current flowing through each section when the device is in the steady state. If a photon reaches the i^{th} nanowire, it will cause the superconducting-normal transition with a probability *η*_{i}=*η*(*I*_{B}/*I*_{C}^{(i)}), where *η* is the current-dependent detection efficiency and *I*_{C}^{(i)} is the critical current of the nanowire (Verevkin et al. 2002) (the nanowires have different critical currents, being differently constricted (Kerman et al. 2007)). Because of the sudden increase in the resistance of the firing nanowire, its current (*I*_{f}) is redistributed between the other *N*-1 unfiring branches and *R*_{out}. It follows that if *n* sections fire simultaneously (in a time interval much shorter than the current relaxation time), part of their currents sum up on the external load.

The device shows PNR capability if the height of the current pulse through *R*_{out} for *n* firing stripes is *n* times higher than the pulse for one, i.e. if the leakage current *δI*_{lk}=*I*_{u}-*I*_{B} is negligible with respect to *I*_{B}. The leakage current is also undesirable because it lowers the signal available for amplification and temporarily increases the current flowing through the still superconducting (unfiring) sections, possibly driving them normal. Consequently, *δI*_{lk} limits the maximum bias current allowed for the stable operation of the device and thus the detection efficiencies of the sections. The leakage current depends on the ratio between the impedance of a section Z_{S} and *R*_{out}, and it can be reduced by engineering the dimensions of the nanowire (thus its kinetic inductance) and of the series resistor (see sec. 5). The design without series resistors simplifies the fabrication process but, as Z_{S} is lower, *δI*_{lk} significantly limits the detection efficiency of the device.

## 3. Fabrication

NbN films 3-4 nm thick were grown on sapphire (substrate temperature T_{S}=900 °C (Gol'tsman et al. 2003; Gol'tsman et al. 2007)) or MgO (T_{S}=400 °C (Marsili et al. 2008)) substrates by reactive magnetron sputtering in an argon–nitrogen gas mixture. Using an optimized sputtering technique, our NbN samples exhibited a superconducting transition temperature of *T*_{C}=10.5 K for 40-Å-thick films. The superconducting transition width was Δ*T*_{C}=0.3 K.

Both the designs with and without the integrated bias resistors were implemented. Detector size ranges from 5x5 μm^{2} to 10x10 μm^{2} with the number of parallel branches varying from 4 to 14. The nanowires are 100 to 120 nm wide and the fill factor of the meander is 40 to 60%. The length of each nanowire ranges from 25 to 100 μm.

For the devices on MgO, the three nanolithography steps needed to fabricate the structure have been carried out by using an electron beam lithography (EBL) system equipped with a field emission gun (acceleration voltage 100 kV, 20 nm resolution). In the first step e-beam lithography is used to define pads (patterned as a 50 Ω coplanar transmission line) and alignment markers on a 450nm-thick polymethyl methacrylate (PMMA, a positive tone electronic resist) layer. The sample is then coated with a Ti-Au film (60 nm Au on 10nm Ti) deposited by e-gun evaporation, which is selectively removed by lift-off from un-patterned areas. In the second step, a 160nm thick hydrogen silsesquioxane (HSQ FOX-14, a negative tone electronic resist) mask is defined reproducing the meander pattern. The alignment between the different layers is performed using the markers deposited in the first lithography step. All the unwanted material, i.e. the material not covered by the HSQ mask and the Ti/Au film, is removed by using a fluorine based (CHF_{3}+SF_{6}+Ar ) reactive ion etching (RIE). Finally, with the third step the series resistors (85nm AuPd alloy, 50%-each in weight), aligned with the two previous layers, are fabricated by lift off via a PMMA stencil mask. Our process is optimized to obtain both an excellent alignment between the different e-beam nanolithography steps (error of the order of 100 nm) and a nanowire with high width uniformity (less than 10% (Mattioli et al. 2007)).

Details on the fabrication process of the devices on sapphire can be found in (Gol'tsman et al. 2007).

## 4. Device optical characterization

In this section we present the results of the optical characterization of PNDs and PND-Rs (Divochiy et al. 2008), i.e. their speed performance (section 4.2), the proof of their PNR capability and their detection efficiency at λ=1.3 μm (section 4.3).

### 4.1. Measurement setup

Electrical and optical characterizations have been performed in a cryogenic probe station with an optical window and in cryogenic dipsticks. The bias current was supplied through the DC port of a 10 MHz-4 GHz bandwidth bias-T connected to a low-noise voltage source in series with a bias resistor. The AC port of the bias-T was connected to the room-temperature, low-noise amplifiers. The amplified signal was fed either to a 1 GHz bandwidth single-shot oscilloscope, a 40 GHz bandwidth sampling oscilloscope, or a 150 MHz bandwidth counter for time-resolved measurements and statistical analysis. The devices were optically tested using a fiber-pigtailed, gain-switched laser diode at 1.3 μm wavelength (100 ps-long pulses, repetition rate 26 MHz), a mode-locked Ti:sapphire laser at 700 nm wavelength (40 ps-long pulses, repetition rate 80 MHz), or an 850 nm GaAs pulsed laser (30 ps-long pulses, repetition rate 100 kHz).

In the cryogenic probe station (Janis) the devices were tested at a temperature T=5 K. Electrical contact was realized by a cooled 50 Ω microwave probe attached to a micromanipulator, and connected by a coaxial line to the room-temperature circuitry. The light was fed to the PNDs through a single-mode optical fiber coupled with a long working distance objective, allowing the illumination of a single detector.

In the cryogenic dipsticks the devices were tested at 4.2 K or 2 K. The light was sent through a single-mode optical fiber either put in direct contact and carefully aligned with the active area of a single device or coupled with a short focal length lens, placed far from the plane of the chip to ensure uniform illumination. The number of incident photons per device area was estimated with an error of 5 %.

Throughout the paper, the single-photon detection efficiencies of an *N*-PND (*η*) and of one of its sections (*η*_{i}) are defined with respect to the photon flux incident on the area covered by the device (active area A_{d}, typically 10 x 10 µm^{2}) or by one section (A_{d}/*N*), respectively.

### 4.2. Speed performance

Figure 2.a shows a single-shot oscilloscope trace of the photoresponse of an 8.6x8 μm^{2} 5-PND under laser illumination (λ=700 nm, 80 MHz repetition rate). Pulses with five different amplitudes can be observed, corresponding to the transition of one to five sections. The measured 80 MHz counting rate represents an improvement of three orders of magnitude over most of the PNR detectors at telecom wavelength (Rosenberg et al. 2005; Fujiwara & Sasaki 2007; Jiang et al. 2007), with the only exception of the SSPD array (Dauler et al. 2007).

We investigated the temporal response of a 10x10 μm^{2} 4-PND-R probed with light at 1.3 μm wavelength using a 40 GHz sampling oscilloscope (Figure 2.b). All four possible amplitudes can be observed. The pulses show a full width at half maximum (FWHM) as low as 660 ps. In a traditional 10x10 μm^{2} SSPD, the pulse width would be of the order of 10 ns FWHM, so the recovery of the output current *I*_{out} through the amplifier input resistance is a factor ~4^{2} faster (see section 5.3), which agrees with results reported by other groups (Gol'tsman et al. 2007; Tarkhov et al. 2008). As shown in section 5.3, the very attractive N^{2} scaling rule for the output pulse duration unfortunately does not apply to the device recovery time.

### 4.3. Proof of PNR capability

Let an *N*-PND be probed with a light whose photon number probability distribution is S=[S(*m*)]=[s_{m}]. The probability distribution of the number of measured photons Q=[Q(*n*)]=[q_{n}] is related to S by the relation:

$$Q(n)=\sum_{m}P^{N}(n|m)\,S(m) \qquad (1)$$

where $P^{N}(n|m)$ is the probability that *n* photons are detected when *m* are sent to the device.

To infer whether a PND is able to measure the number of incoming photons, it can be probed with a Poissonian distribution S(*m*)=μ^{m}∙exp(-μ)/*m*! (μ: mean photon number). The limited efficiency *η*<1 of the detector is equivalent to an optical loss, and reduces the mean photon number to μ^{*}=*η*μ. In the regime μ^{*}<<1,

$$Q(n)\propto\frac{(\mu^{*})^{n}}{n!}\,e^{-\mu^{*}}\approx\frac{(\mu^{*})^{n}}{n!} \qquad (2)$$

Consequently, the probability Q(1) of detecting one photon is proportional to μ, Q(2) is proportional to μ^{2}, and so on.
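The equivalence between limited efficiency and optical loss can be checked numerically. The following is a minimal sketch, assuming an idealized detector in which every photon is detected independently with the same probability `eta` (the finite number of sections is ignored); the function names and the values of `mu` and `eta` are illustrative and do not come from the measurements.

```python
import math

def poisson(mu, m):
    """Poissonian probability of m photons for a mean photon number mu."""
    return mu**m * math.exp(-mu) / math.factorial(m)

def detected_distribution(mu, eta, n, m_max=80):
    """Probability of detecting n photons when each of the m incoming photons
    is detected independently with probability eta (pure loss model)."""
    return sum(math.comb(m, n) * eta**n * (1 - eta)**(m - n) * poisson(mu, m)
               for m in range(n, m_max + 1))

mu, eta = 5.0, 0.02          # illustrative values, mu* = eta*mu = 0.1
for n in range(4):
    q_n = detected_distribution(mu, eta, n)
    # The lossy Poissonian is again Poissonian, with mean mu* = eta*mu
    print(n, q_n, poisson(eta * mu, n))
```

Under these assumptions the two printed columns coincide, and for μ^{*}<<1 each Q(*n*) scales as μ^{n}.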

A 10x10 μm^{2} 5-PND-R was tested with the coherent light of a GaAs pulsed laser (λ=850 nm, 30 ps pulse width, 100 kHz repetition rate), whose photon number distribution is close to a Poissonian. The photoresponse from the device was sent to a 150 MHz counter. The detection probabilities relative to one-, two- and three-photon absorption events are plotted for μ varying from 0.15 to 40 in Figure 3.a. As the mean single-photon detection efficiency (*η*, defined with respect to A_{d}) is a few percent (Figure 3.b) and µ is at most a few tens, the condition *η*μ=μ^{*}<<1 is verified and (2) is therefore valid. Indeed, the fits clearly show that Q(1), Q(2) and Q(3) scale as μ, μ^{2} and μ^{3}, respectively.

The device mean single-photon detection efficiency
^{-18} W/Hz^{1/2}), limited only by the room temperature background radiation coupling to the PND. This sensitivity outperforms most of the other approaches by one-two orders of magnitude (with the only exception of transition-edge sensors (Rosenberg et al. 2005), which require a much lower operating temperature).

## 5. PND design

In this section, we provide a detailed analysis of the device operation and guidelines for the design of PNDs with optimized performance in terms of efficiency, speed and sensitivity (see also (Marsili et al. 2009b)).

The first step is to define the relevant parameter space. The width of the nanowire (w=100 nm) and the filling factor (f=50%) of the meander are fixed by technology, the thickness of the superconducting film (t=4 nm) is the optimum value yielding the maximum device efficiency and the active area (A_{d}=10 x 10 µm^{2}) is fixed by the size of the core of single-mode fibers to which the device must be coupled. We consider single-pass geometries (no optical cavity), but the same guidelines can be applied to cavity devices with optimized absorption (Rosfjord et al. 2006). The parameters of the PND-R that can be used as free design variables are: the number of sections in parallel *N*, the value of the series resistor *R*_{0} and the value of the inductance of each section *L*_{0}. The number of sections in parallel *N* can be chosen within a discrete set of values (*N*=2, 3, 4, 6, 7, 10, 17), which satisfy the constraints on w, f and the size of the pixel, and the requirement that the number of stripes in each section be odd (we consider the geometry of Figure 1a). The value of *L*_{0} is the sum of the kinetic inductance of each meander L_{kin} and of a series inductance which can optionally be added. L_{kin} is not a design parameter, as it is fixed by w, t, f, A_{d} and *N*. If no series inductors are added (bare devices, *L*_{0}=L_{kin}), the value of *L*_{0} for each *N* is listed in Table 1.

Table 1. Inductance (*L*_{0}) and number of squares (SQ) of each section for all possible values of *N*. The width of the nanowires is w=100 nm, the thickness is t=4 nm. The kinetic inductance per square was estimated (L_{kin}/□=90 pH) from the time constant of the exponential decay of the output current (τ_{out}=τ_{f}=L_{kin}/*R*_{out}, see sec. 5.3) for a standard 5x5 µm^{2} SSPD (Marsili et al. 2008).
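The order of magnitude of the bare-device inductance can be estimated from the quantities quoted in the text (w, f, A_{d} and the kinetic inductance per square). The sketch below is a rough estimate, assuming that the nanowire length is simply f∙A_{d}/w and that it is shared equally among the *N* sections; it ignores the bends of the meander and the odd-number-of-stripes constraint, so the values it prints need not coincide with the tabulated ones. The function name is illustrative.

```python
def section_estimate(n_sections, area_um2=100.0, width_um=0.1,
                     fill_factor=0.5, l_square_pH=90.0):
    """Rough number of squares and kinetic inductance (nH) of one section.

    Assumption: total nanowire length = fill_factor * area / width,
    split equally among the parallel sections.
    """
    total_squares = fill_factor * area_um2 / width_um**2
    squares = total_squares / n_sections
    inductance_nH = squares * l_square_pH * 1e-3
    return squares, inductance_nH

for N in (2, 3, 4, 6, 7, 10, 17):
    squares, L0 = section_estimate(N)
    print(f"N={N:2d}  squares per section ~ {squares:7.0f}  L0 ~ {L0:6.1f} nH")
```

With these assumptions a 4-section device has *L*_{0} of roughly 110 nH, which is compatible with the time constants quoted in section 5.3 for *R*_{0}=*R*_{out}=50 Ω.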

An additional free parameter, relative to the read-out, is the impedance seen by the device on the RF section of the circuit, *R*_{out}, which is 50 Ω (that of the matched transmission line) in the actual measurement setup (see section 4.1), but which can be varied from zero to infinity by introducing a cold preamplifier stage.

The target performance specifications are the single-photon detection efficiency (*η*), the signal to noise ratio (SNR) and the maximum repetition rate (speed), which must be optimized under the constraints that the operation of the device is stable and that it is possible to detect a certain maximum number of photons (n_{max}) dependent on the specific application.

This section is organized as follows. First we present the electrical equivalent model of the device developed to study its working principle and to define design guidelines (section 5.1). Then we define the figures of merit of the device performance in terms of efficiency (section 5.2), speed (section 5.3) and sensitivity (section 5.4) and we analyze their dependency on the design parameters (*L*_{0}, *R*_{0}, *R*_{out}, *N*).

### 5.1. Electrical model

Although a comprehensive description of PND operation should combine thermal and electrical modeling of the nanowires (Yang et al. 2007), it is possible to use a purely electrical model (see section 2 and Figure 1b) to make a reliable guess on how the device performance varies when moving in the parameter space (Marsili et al. 2009b).

In this model, the dependence of L_{kin} on the current flowing through the nanowire was disregarded, and it was assumed constant. Furthermore, it has been shown (Yang et al. 2007) that changing the values of the kinetic inductance of an SSPD or of a resistor connected in series to it results in a change of the hotspot resistance and of its lifetime, eventually causing the device to latch to the normal state. The simplified analysis presented here does not take into account these effects: it considers both *R*_{hs} and t_{hs} as constant (*R*_{hs}=5.5 kΩ, t_{hs}=250 ps) and assumes that the device cannot latch. However, the results of this approach can still quantitatively predict the behavior of the device in the limit where the fastest time constant of the circuit τ_{f} (see section 5.3) is much longer than the hotspot lifetime (τ_{f}>>t_{hs}), and give a reasonable qualitative understanding of the main trends of variation of the performance of faster devices (τ_{f}~t_{hs}).

To gain a better insight into the circuit dynamics (see sec. 5.3) and to reduce the calculation time, the *N*+1 mesh circuit of Figure 1.b can be simplified to the three-mesh circuit of Figure 4.a by applying the Thévenin theorem on the *n* firing sections and on the remaining *N*-*n* still superconducting (unfiring) sections, separately. Figures 4.b to d show the simulation results for the time evolution of the currents flowing through *R*_{out} and through the unfiring (*I*_{u}) and firing (*I*_{f}) sections of a PND with 6 sections and integrated resistors (6-PND-R) and for the number of firing sections *n* ranging from 1 to 6. As *n* increases, the peak values of the output current (*I*_{out}, Figure 4.b) and of the current through the unfiring sections (*I*_{u}, Figure 4.c) increase. The firing sections experience a large drop in their current (*I*_{f}, Figure 4.d), which is roughly independent of *n*. The observed temporal dynamics will be examined in the following sections.
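The purely electrical model can be illustrated with a few lines of numerical integration. The following is a minimal sketch, not the simulation used to produce Figure 4: it assumes a lumped circuit in which the *n* firing sections share one current and the *N*-*n* unfiring sections another, a constant hotspot resistance during t_{hs}, a bias-T capacitor that holds the steady-state voltage *R*_{0}*I*_{B}, and a forward-Euler time step. The values of `L0` and `I_B` and the function name are assumptions chosen for illustration.

```python
def simulate_pnd(n, N=4, L0=115e-9, R0=50.0, Rout=50.0, Rhs=5.5e3,
                 t_hs=250e-12, I_B=10e-6, t_end=5e-9, dt=1e-13):
    """Forward-Euler integration of the lumped PND circuit.

    Returns the peak output current and the peak leakage current
    (I_u - I_B) when n of the N sections fire at t = 0.
    """
    I_f = I_u = I_B                       # steady state: each section carries I_B
    peak_out = peak_leak = 0.0
    for k in range(int(t_end / dt)):
        t = k * dt
        # Kirchhoff current law at the common node (total bias N*I_B)
        I_out = N * I_B - n * I_f - (N - n) * I_u
        # Node voltage: capacitor voltage R0*I_B plus the drop on Rout (assumption)
        V = R0 * I_B + Rout * I_out
        R_firing = R0 + (Rhs if t < t_hs else 0.0)   # hotspot present for t < t_hs
        I_f += dt * (V - R_firing * I_f) / L0
        I_u += dt * (V - R0 * I_u) / L0
        peak_out = max(peak_out, I_out)
        if n < N:                          # leakage is defined only if unfiring sections exist
            peak_leak = max(peak_leak, I_u - I_B)
    return peak_out, peak_leak

for n in range(1, 5):
    peak_out, peak_leak = simulate_pnd(n)
    print(f"n={n}: peak I_out = {peak_out*1e6:5.2f} uA, peak leakage = {peak_leak*1e6:5.2f} uA")
```

In this sketch both the peak output current and the peak leakage current grow with *n*, which is the qualitative behavior described above.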

### 5.2. Current redistribution and efficiency

Let *n* sections fire simultaneously. The stability requirement translates into the condition that, for each unfiring section, the current does not exceed the critical current:

$$I_{u}=I_{B}+\delta I_{lk}^{(n)}\le I_{C}^{(i)}$$

where $\delta I_{lk}^{(n)}$ is the peak value of the leakage current when *n* sections fire. The leakage current therefore limits the bias current and hence the single-photon detection efficiency (*η*), which, for a certain nanowire geometry (i.e. w, t fixed), is a monotonically increasing function of *I*_{B}/*I*_{C} (Verevkin et al. 2002). For instance, to detect a single photon (at λ=1.3 μm, T=1.8 K) in a section with an efficiency equal to 80% of the maximum value set by absorption (~32% (Gol'tsman et al. 2007)), […] *I*_{B}. Therefore the leakage current strongly affects the performance of the device and must be minimized, which makes it very important to understand its dependence on the design parameters.

The leakage current can be investigated just in the case of *n*=1, as the design guidelines drawn from this analysis still apply to higher *n* (Marsili et al. 2009b). The dependency of the leakage current on *N* and *L*_{0} at fixed *R*_{0} and *R*_{out} (both equal to 50 Ω) is shown in Figure 5.a: an orange line highlights bare devices (*L*_{0}=L_{kin}, see Table 1) and the colored bars are relative to devices which respect the constraints on the geometry of the structure (*L*_{0}>L_{kin}), while the grey bars refer to purely theoretical devices which just show the general trend. For any *N*, the current redistribution increases with decreasing *L*_{0}, as the impedance of each section decreases. Keeping *L*_{0} constant, the leakage current decreases with increasing *N*, as the current to be redistributed is fixed and the number of channels draining current increases. For this reason also the increase of redistribution with decreasing *L*_{0} becomes weaker for high *N*.

The dependency of the leakage current on *R*_{0} and *R*_{out} (for the same *N*, *L*_{0}) is very intuitive (Marsili et al. 2009b). Indeed, the redistribution decreases as *R*_{0} increases because the impedance of each section increases with respect to the output resistance. For the same reason, the redistribution decreases as *R*_{out} is decreased.

In conclusion, the result of this simplified analysis is that, to minimize the leakage current and thus maximize the efficiency, *N*, *L*_{0} and *R*_{0} must be made as high as possible and *R*_{out} as low as possible. We note however that *R*_{0} cannot be increased indefinitely, otherwise the nanowire latches to the hotspot plateau before *I*_{B} reaches *I*_{C} (Marsili et al. 2008).

### 5.3. Transient response and speed

Before proceeding to the analysis of the SNR and speed performances of the device, it is necessary to discuss the characteristic recovery times of the currents in the circuit.

The transient response of the simplified equivalent electrical circuit of the *N*-PND (Figure 4.a) to an excitation produced in the firing branch can be found analytically. Therefore, the transient response of the current through the firing sections *I*_{f}, through the unfiring sections *I*_{u} and through the output *I*_{out} after the nanowires become superconducting again (t≥t_{hs}) can be written as:

where τ_{s}=*L*_{0}/*R*_{0} and τ_{f}=*L*_{0}/(*R*_{0}+*N*∙*R*_{out}) are the “slow” and the “fast” time constant of the circuit, respectively. This set of equations describes quantitatively the time evolution of the currents after the healing of the hotspot in the case τ_{f}>>t_{hs}, and it provides a qualitative understanding of the recovery dynamics of the circuit for shorter τ_{f}.
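The origin of the two time constants can be made explicit with a short derivation. What follows is a sketch under stated assumptions and is not meant to reproduce the set of equations (3) term by term: after the hotspot has healed every section is the series of *L*_{0} and *R*_{0}, the bias-T capacitor holds the constant voltage *R*_{0}*I*_{B}, and the total bias current is *N*∙*I*_{B}. The symbol Δ_{0}, denoting the value of *I*_{f}-*I*_{u} at t=t_{hs}, is introduced here only for illustration.

```latex
% Deviations from the steady state for t >= t_hs (assumed lumped model)
\begin{align}
L_0\,\frac{d}{dt}\,(I_k - I_B) + R_0\,(I_k - I_B) &= R_{out}\,I_{out}, \qquad k = 1,\dots,N \\
n\,I_f + (N-n)\,I_u + I_{out} &= N\,I_B
\end{align}
% Summing over the N sections isolates the fast mode;
% subtracting a firing and an unfiring section isolates the slow mode
\begin{align}
I_{out}(t) &= I_{out}(t_{hs})\,e^{-(t-t_{hs})/\tau_f}, & \tau_f &= \frac{L_0}{R_0 + N R_{out}} \\
I_f(t) - I_u(t) &= \Delta_0\,e^{-(t-t_{hs})/\tau_s}, & \tau_s &= \frac{L_0}{R_0}
\end{align}
% Solving for the individual currents
\begin{align}
I_f(t) &= I_B - \frac{1}{N}\,I_{out}(t) + \frac{N-n}{N}\,\Delta_0\,e^{-(t-t_{hs})/\tau_s} \\
I_u(t) &= I_B - \frac{1}{N}\,I_{out}(t) - \frac{n}{N}\,\Delta_0\,e^{-(t-t_{hs})/\tau_s}
\end{align}
```

In this form *I*_{out} decays with τ_{f} only, while the slow contribution to *I*_{f} has a weight (*N*-*n*)/*N*: it vanishes for *n*=*N* and is largest for *n*=1.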

The recovery transients (t≥t_{hs}) of *I*_{out}, *δI*_{lk} and *I*_{f} for a 4-PND-R simulated with the circuit of Figure 4.a are shown in Figure 6a, b, c, respectively (in blue) for different numbers of firing sections (*n*=1 to 4). As *n* increases from 1 to 4, the recoveries of *I*_{out} and *δI*_{lk} change only by a scale factor. On the other hand, the transient of *I*_{f} depends on *n* and becomes faster with increasing *n*, as qualitatively predicted by the first of equations (3). Indeed, *I*_{f} consists of the sum of a slow and a fast contribution, whose balance is controlled by the number of firing sections *n*. To prove the quantitative agreement with the analytical model in the limit τ_{f}>>t_{hs}, the simulated transients of *I*_{out}, *δI*_{lk} and *I*_{f} have been fitted (Figure 6a, b, c, respectively, in red) using the set of equations (3) and four fitting parameters (τ_{s}, τ_{f}, a time offset t_{0} and a scaling factor K). The values of τ_{s} and τ_{f} obtained from the three fittings (of *I*_{out}, of *δI*_{lk} and of the whole set of four *I*_{f} for *n*=1,…, 4) closely agree with the values calculated from the analytical expressions presented above and the parameters of the circuit (τ_{s}^{*}=2.30 ns, τ_{f}^{*}=460 ps).

To quantify the speed of the device, we take f_{0}=(t_{reset})^{-1} as the maximum repetition frequency, where t_{reset} is the time that *I*_{f} needs to recover to 95% of the bias current after a detection event. According to the results presented above, which are in good agreement with experimental data (Figure 2.b), *I*_{out} decays exponentially with the same time constant for any *n* (τ_{out}=τ_{f}), which, for a bare *N*-PND, is N^{2} times shorter than that of a normal SSPD of the same surface (Gol'tsman et al. 2007; Tarkhov et al. 2008). This, however, is not related to the speed of the device. Indeed, t_{reset} is the time that the current through the firing sections *I*_{f} needs to rise back to its steady-state value (*I*_{f}~*I*_{B}). In the best case of *n*=*N*, *I*_{f} rises with the fast time constant τ_{f}, but in all other cases the slow contribution becomes more important as *n* decreases (see Figure 4.d and Figure 6.c), until, for *n*=1, *I*_{f}~[1-exp(-t/τ_{s})]. The speed performance of the device is then limited by the slow time constant (t_{reset}~3∙τ_{s}), which means that an *N*-PND is only *N* times faster than a normal SSPD of the same surface, being as fast as a normal SSPD whose kinetic inductance is the same as that of one of the *N* sections of the *N*-PND.
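The figures of merit for the speed follow directly from the two time constants. The sketch below evaluates them for a 4-PND-R with *R*_{0}=*R*_{out}=50 Ω; the value *L*_{0}=115 nH is an assumption, chosen because it reproduces the quoted τ_{s}^{*}=2.30 ns, and the function name is illustrative.

```python
def time_constants(N, L0, R0, Rout):
    """Slow and fast time constants of the N-PND-R circuit, in seconds."""
    tau_s = L0 / R0
    tau_f = L0 / (R0 + N * Rout)
    return tau_s, tau_f

# Assumed parameters of a 4-PND-R (L0 inferred from the quoted slow time constant)
tau_s, tau_f = time_constants(N=4, L0=115e-9, R0=50.0, Rout=50.0)

# Recovery to 95% of I_B with a single exponential takes about 3 time constants,
# since 1 - exp(-3) is approximately 0.95
t_reset = 3 * tau_s
f0 = 1 / t_reset

print(f"tau_s = {tau_s*1e9:.2f} ns, tau_f = {tau_f*1e12:.0f} ps")
print(f"t_reset ~ {t_reset*1e9:.1f} ns, f0 ~ {f0/1e6:.0f} MHz")
```

With these assumed parameters the output pulse decays with τ_{f}, but the maximum repetition frequency is set by τ_{s}, which is five times longer.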

### 5.4. Signal to noise ratio

The peak value and the duration of the output current pulse are a function of the design parameters (see below and section 5.3, respectively). As the output pulse becomes faster, amplifiers with larger bandwidth are required and thus electrical noise becomes more important. To assess the possibility of discriminating the output pulse from the noise, we define the signal to noise ratio (SNR) as the ratio between the maximum of the output current and the rms value of the noise current at the preamplifier input, *I*_{n}.

The peak value of the output current when *n* sections fire simultaneously (see Figure 4b, relative to a 6-PND-R) can be written as:

$$I_{out}^{(n)}=n\left(I_{B}-I_{f}^{*}\right)-(N-n)\,\delta I_{lk}^{(n)*}$$

where the starred values refer to the time t=t^{*} when the output current peaks.

As *n*=1 represents the worst case, to evaluate the performance of the device in terms of the SNR, the dependency of the peak output current for *n*=1 on *N* and *L*_{0} at fixed *R*_{0} and *R*_{out} (both equal to 50 Ω) is shown in Figure 5.b. Inspecting the values of […] *I*_{B}, which is due to the fact that the output current and the leakage current peak at two different times, t* and t_{lk}, respectively (Figure 4.b, c). Furthermore, as t_{lk}>t*, the output current is not significantly affected by redistribution, because *I*_{out} is maximum when *δI*_{lk} is still beginning to rise.

The expression for t_{lk} can be derived from (3): t_{lk}=*L*_{0}/(*N*∙*R*_{out})∙ln(1+*N*∙*R*_{out}/*R*_{0}), which means that increasing the device speed (decreasing *L*_{0}, or increasing *R*_{0}, *N* or *R*_{out}) makes the redistribution faster and then increases the leakage current at the time t* of the output peak, *δI*_{lk}^{(1)*}.
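The expression for t_{lk} can be cross-checked numerically. The sketch below assumes that, after the healing of the hotspot, the leakage current is proportional to exp(-t/τ_{s})-exp(-t/τ_{f}), i.e. that it starts from approximately zero; this shape is an assumption consistent with the expression quoted above, not a formula taken from the text. The parameter values and function names are illustrative.

```python
import math

def t_leak_peak(N, L0, R0, Rout):
    """Time at which the leakage current peaks, from the closed-form expression."""
    return L0 / (N * Rout) * math.log(1 + N * Rout / R0)

def leak_shape(t, tau_s, tau_f):
    """Assumed (unnormalized) time dependence of the leakage current."""
    return math.exp(-t / tau_s) - math.exp(-t / tau_f)

N, L0, R0, Rout = 4, 115e-9, 50.0, 50.0           # assumed 4-PND-R parameters
tau_s, tau_f = L0 / R0, L0 / (R0 + N * Rout)

t_lk = t_leak_peak(N, L0, R0, Rout)
# Brute-force search of the maximum on a 1 ps grid, for comparison
grid = [k * 1e-12 for k in range(1, 10000)]
t_num = max(grid, key=lambda t: leak_shape(t, tau_s, tau_f))

print(f"closed form: {t_lk*1e12:.1f} ps, grid search: {t_num*1e12:.0f} ps")
print(f"R0 doubled:   {t_leak_peak(N, L0, 2 * R0, Rout)*1e12:.1f} ps")
print(f"Rout doubled: {t_leak_peak(N, L0, R0, 2 * Rout)*1e12:.1f} ps")
```

In this example t_{lk} becomes shorter when either *R*_{0} or *R*_{out} is increased, and it scales linearly with *L*_{0}.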

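So, for any given *N*, the peak output current increases with *L*_{0}, both because the redistribution is slower and because *δI*_{lk} is lower. Keeping *L*_{0} constant, the peak output current decreases with increasing *N* because even though […]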
The redistribution speed-up explains the dependency of the output current on *R*_{0} (for the same *N*, *L*_{0}). Indeed, even though the leakage current decreases as *R*_{0} increases (see section 5.2), the output current decreases due to the decrease of t_{lk}: *δI*_{lk}^{(1)*} increases despite the decrease of the peak value of the leakage current. On the other hand, a decrease in *R*_{out} makes the redistribution much less effective, as t_{lk} decreases more slowly with decreasing *R*_{out} than with increasing *R*_{0} (Marsili et al. 2009b).

In conclusion, to maximize the output current, *N*, *R*_{0} and *R*_{out} must be minimized, while *L*_{0} must be made as high as possible.

The rms value of the noise current at the preamplifier input *I*_{n} can be written as $I_{n}=\sqrt{S_{n}\,\Delta f}$, where S_{n} is the noise spectral power density of the preamplifier and Δf is the bandwidth of the output current *I*_{out}, which is estimated as Δf=1/τ_{out}, where τ_{out}=τ_{f}=*L*_{0}/(*R*_{0}+*N*∙*R*_{out}) is the time constant of the exponential decay of *I*_{out} (see sec. 5.3). *I*_{n} is then a function of the parameters of the device and of the read-out through S_{n} and τ_{f}, and it is minimized under the same conditions that maximize *I*_{out}, i.e. by minimizing *N*, *R*_{0} and *R*_{out} and maximizing *L*_{0}.

The same optimization criteria then apply naturally to the SNR. The dependence of the SNR on *N* and *L*_{0} is shown in Figure 7 for cryogenic (77 K working temperature, in blue) and room-temperature amplifiers (in yellow). Amplifiers with different -3 dB bandwidths have been considered, depending on the bandwidth of the output current pulse that they were supposed to amplify. Depending on the amplifier bandwidth, noise figures of F=0.44 to 1.8 dB (F=1.1 to 5 dB) have been considered in the calculation of S_{n} for the room-temperature (cryogenic) amplifier. The input resistance is *R*_{out}=50 Ω.

The main design guidelines which can be deduced from the analysis of sections 5.2 to 5.4 are summarized in Table 2. The type of dependency of the leakage current, f_{0}, the output current and *I*_{n} on the design parameters (*L*_{0}, *R*_{0}, *R*_{out}, *N*) is indicated.

## 6. Application to the measurement of photon number statistics

We wish to determine whether the PND can be used to measure an unknown photon number probability distribution S (see section 4.3). Indeed, the light statistics measured with a PND differ from the original one due to non-idealities such as the limited number of sections and limited and non-uniform efficiencies (*η*_{i}) of the different sections.

In this section, we present the modeling tools (section 6.1) used to fully characterize the device (section 6.2) (Marsili et al. 2009a) and to develop an algorithm to estimate the photon number statistics of an unknown light (section 6.3) (Marsili et al. 2009b).

### 6.1. Modeling and simulation

Equation (1) may be rewritten in a matrix form as Q=P^{N}∙S, where P^{N}=[P^{N}(*n*|*m*)] is the matrix of the conditional probabilities of the *N*-PND. Assuming that the illumination of the device is uniform, the parallel connection of *N* nanowires can be considered equivalent to a balanced lossless *N*-port beam splitter, every channel terminating with a single photon detector (SPD) (Figure 8.a). Each incoming photon is then equally likely to reach one of the *N* SPDs (with a probability 1/*N*). Each SPD can detect a photon with a probability *η*_{i} (i=1,..,*N*) different from all the others, and gives the same response for any number (*n*≥1) of photons detected (Figure 8.b). The number of SPDs firing then gives the measured photon number. Two classes of terms in P^{N} can be calculated directly, the others being derived from these by a recursion relation (Divochiy et al. 2008; Marsili et al. 2009a). These terms are the probabilities P^{N}(*m*|*m*) that all the *m*≤*N* photons sent are detected and P^{N}(0|*m*) that no photons are detected when *m* are sent.

In the case of zero detections,

$$P^{N}(0|m)=\frac{1}{N^{m}}\sum_{i_{1}=1}^{N}\cdots\sum_{i_{m}=1}^{N}\;\prod_{k=1}^{m}\left(1-\eta_{i_{k}}\right) \qquad (4)$$

which assumes that a photon incident on the i^{th} nanowire fails to be detected with an independent probability of (1-*η*_{i}). The sum in (4) accounts for all the possible combinations when taking *m* elements in an ensemble of *N* with order and with repetition (permutations with repetitions). This is because more than one photon can hit the same stripe (which gives the repetition), and the photons are considered distinguishable (which gives the order).

In the case that all the photons are detected, since *m* photons must reach *m* distinct nanowires:

$$P^{N}(m|m)=\frac{1}{N^{m}}\sum_{\substack{i_{1},\dots,i_{m}=1\\ i_{j}\neq i_{l}\ \text{for}\ j\neq l}}^{N}\;\prod_{k=1}^{m}\eta_{i_{k}} \tag{5}$$

The sum in (5) accounts for all the possible combinations when taking *m* elements in an ensemble of *N* with order and without repetition (permutations without repetition). This is because at most one detected photon can hit each stripe (which gives the non-repetition), and the photons are considered distinguishable (which gives the order).

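The recursion relation for P^{N}(*n*|*m*) is:

$$P^{N}(n|m)=P^{N}(n|m-1)\left[\frac{n}{N}+\frac{1}{C}\sum_{k=1}^{C}\sum_{j=1}^{N-n}\frac{1-\eta_{j}^{k}}{N}\right]+P^{N}(n-1|m-1)\,\frac{1}{C'}\sum_{k=1}^{C'}\sum_{j=1}^{N-n+1}\frac{\eta_{j}^{k}}{N} \tag{6}$$

The first term on the right-hand side of (6) is the probability that *n* photons are detected when *m*-1 are sent, times the probability that the *m*^{th} photon reaches one of the *n* nanowires already occupied (first term in the square brackets) or that it fails to be detected reaching one of the *N*-*n* unoccupied nanowires (second term in the square brackets). To clarify how the latter probability is derived, it is sufficient to consider a particular configuration *k* (see Figure 8.b) of *n* firing stripes (which have already detected a photon) and *N*-*n* unfiring stripes (still active). The probability that the incoming photon will not be detected when incident on any of the *N*-*n* unfiring stripes is then written as:

$$\sum_{j=1}^{N-n}\frac{1-\eta_{j}^{k}}{N} \tag{7}$$

where *η*_{j}^{k} is the efficiency of the j^{th} of the *N*-*n* stripes active in the k^{th} configuration of (*n*)firing-(*N*-*n*)unfiring stripes considered. So a mean must be calculated on all the possible (*n*)firing-(*N*-*n*)unfiring configurations for the *N* stripes. Let C be the number of all these configurations. The mean is then calculated summing C terms of the type (7), and dividing by C:

$$\frac{1}{C}\sum_{k=1}^{C}\sum_{j=1}^{N-n}\frac{1-\eta_{j}^{k}}{N}$$

C is the number of combinations of *N*-*n* elements in an ensemble of *N*, and it is given by the binomial coefficient:

$$C=\binom{N}{N-n}=\frac{N!}{n!\,(N-n)!}$$

The second term on the right-hand side of (6) is the probability that *n*-1 photons are detected when *m*-1 are sent times the probability that the m^{th} photon reaches one of the *N*-(*n*-1) unoccupied nanowires and is detected. The latter probability is averaged in the same way over the C' configurations with *n*-1 firing stripes, where C' is the binomial coefficient of *N* over *N*-*n*+1.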
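
The construction of P^{N} described above can be sketched numerically. The following Python sketch is an illustration under stated assumptions, not the authors' code. It applies the recursion to every element, starting from P^{N}(0|0)=1, and replaces the average over configurations by its closed form: each stripe belongs to the unfiring set in a fraction (*N*-*n*)/*N* of the configurations, so the averages reduce to the mean efficiency. With this simplification the zero-detection terms are reproduced exactly, while the all-detected terms match the direct sum only when the efficiencies are uniform. The name `pnd_matrix` and its arguments are hypothetical.

```python
import numpy as np

def pnd_matrix(eta, m_max):
    """Conditional probabilities P[n, m] of n stripes firing when m photons are sent.

    Assumption: configuration averages are replaced by the mean efficiency.
    """
    eta = np.asarray(eta, dtype=float)
    N = eta.size
    eta_mean = eta.mean()
    P = np.zeros((N + 1, m_max + 1))
    P[0, 0] = 1.0                      # no photon sent, no stripe fires
    for m in range(1, m_max + 1):
        for n in range(0, min(m, N) + 1):
            # m-th photon hits an occupied stripe, or is lost on an unoccupied one
            stay = n / N + (N - n) / N * (1.0 - eta_mean)
            P[n, m] = P[n, m - 1] * stay
            if n >= 1:
                # m-th photon hits one of the N-(n-1) unoccupied stripes and is detected
                P[n, m] += P[n - 1, m - 1] * (N - n + 1) / N * eta_mean
    return P

P5 = pnd_matrix([0.03] * 5, m_max=21)
```

Each column of the resulting matrix is a probability distribution over the number of firing stripes, so it sums to one, which is a convenient sanity check.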

To check the consistency of the analytical model, the probability distribution of the number of measured photons Q calculated from P^{N} by (1) was cross-checked with the distribution Q_{MC} resulting from a Monte Carlo simulation (Marsili et al. 2009a). The input parameters of the simulation are the incoming photon number probability distribution S, the number of parallel stripes *N*, and the vector of the single-photon detection efficiencies of the different sections of the device *η*=[*η*_{i}].
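
A Monte Carlo simulation of this kind can be sketched as follows. This is a minimal illustration of the beam-splitter model of section 6.1, not the simulation used in (Marsili et al. 2009a). It assumes that every photon picks one of the *N* stripes with probability 1/*N*, is detected with that stripe's efficiency, and that a stripe gives the same response however many photons it detects. The function name `simulate_pnd`, the seed and the number of pulses are hypothetical choices.

```python
import numpy as np

def simulate_pnd(S, eta, H, seed=0):
    """Monte Carlo estimate of Q (distribution of the number of firing stripes)."""
    rng = np.random.default_rng(seed)
    S = np.asarray(S, dtype=float)
    eta = np.asarray(eta, dtype=float)
    N = eta.size
    # Number of photons in each of the H pulses, drawn from S
    m = rng.choice(S.size, size=H, p=S / S.sum())
    fired = np.zeros((H, N), dtype=bool)
    for j in range(S.size - 1):                 # j-th photon of every pulse
        present = j < m                         # pulses that contain a j-th photon
        stripe = rng.integers(0, N, size=H)     # uniform choice of the stripe
        detected = present & (rng.random(H) < eta[stripe])
        rows = np.flatnonzero(detected)
        fired[rows, stripe[rows]] = True        # repeated hits change nothing
    n = fired.sum(axis=1)                       # measured photon number per pulse
    return np.bincount(n, minlength=N + 1) / H

# Example: exactly two photons per pulse on a 4-stripe device with 50% efficiency
Q_mc = simulate_pnd([0.0, 0.0, 1.0], [0.5] * 4, H=200_000)
```

For this example the exact values are Q(0)=0.25 and Q(2)=0.5·0.5·3/4=0.1875, which the simulated frequencies should approach within statistical error.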

### 6.2. Matrix of conditional probabilities

It has been shown (Achilles et al. 2004, Lee et al. 2004) that an unknown incoming photon number distribution S can be recovered if Q and P^{N} are known.

Let an *N*-PND be probed with a light whose photon number probability distribution is S, and its output be sampled H times. The result of the observation can be of *N*+1 different types (i.e. 0,..., *N* stripes firing), so a histogram of the H events can be constructed, which can be represented by an (*N*+1)-dimensional vector r=[r_{i}], where r_{i} is the number of runs in which the outcome was of the i^{th} type. The expectation value of the statistics obtained from the histogram is E[Q_{ex}=r/H]=Q. Considering equations (4) to (6), it is clear that the matrix of the conditional probabilities of an *N*-PND depends only on the vector of the *N* single-photon detection efficiencies of the different sections of the device *η*=[*η*_{i}]. The vector *η* can be determined from the statistics Q_{ex} measured when probing the device with a light of known statistics S, as described in the following.

A 5-PND (A_{d}=8.6×8 μm^{2}) was tested under uniform illumination with the coherent emission from a mode-locked Ti:sapphire laser, whose photon number probability distribution is close to Poissonian and could be fully characterized by the mean photon number μ with a power measurement. To determine Q_{ex}, histograms of the photoresponse voltage peak V_{pk} were built for values of μ ranging from ~1 to ~100. The signal from the device was sent to a 1 GHz oscilloscope, which was triggered by the synchronization signal generated by the laser unit. The photoresponse was sampled for a gate time of 5 ps, making the effect of dark counts negligible. The discrete probability distribution Q_{ex} was reconstructed from the continuous probability density q(V_{pk}) by fitting the histograms to the sum of 6 Gaussian distributions (corresponding to the five possible pulse levels plus the zero level) and calculating their areas (Figure 9).

The probability distribution of the number of measured photons Q (expressed by (1)) was then fitted to the experimentally measured distribution Q_{ex} using *η* as a free parameter (Figure 10). The resulting *η* and matrix of the conditional probabilities are shown in Figure 11. The fitted efficiencies are rather uniform (2.9±0.5%), indicating a high-quality fabrication process. The value of *η* obtained from the fitting was then used as an input parameter of Monte Carlo simulations (see section 6.1) used to calculate Q_{MC} for each value of μ. The three sets of values (Q, Q_{ex} and Q_{MC}) for the photocount statistics of the six levels are in good agreement over almost two orders of magnitude of μ, confirming the validity of the analytical model.

### 6.3. Maximum-Likelihood (ML) estimation

*i. ML method*

The P^{N} matrix provides a full description of the detector. Once P^{N} is known, several approaches can be used to reconstruct S from the histogram r. If no assumptions on the form of S are made, the maximum likelihood (ML) method is the most suitable, as it is the most efficient in solving this class of problems (Eadie et al.).

Let R=[R_{0},…, R_{N}] be the random vector of the populations of the (*N*+1) different bins of the histogram after H observations. The joint probability density function L(r|Q) for the occurrence of the particular configuration r=[r_{0},…, r_{N}] of R is called the likelihood function of r and it is given by the multinomial distribution (Eadie et al.):

$$L(\mathbf{r}|\mathbf{Q})=H!\prod_{i=0}^{N}\frac{q_{i}^{r_{i}}}{r_{i}!}$$

where Q=[q_{i}] is the probability distribution of the number of measured photons, i.e. the vector of the probabilities to have an outcome in the bin i (i=0,…, *N*) in a single trial.

Considering Q as a function of S=[s_{m}] through (1), we can rewrite the likelihood function of the vector r, depending on the parameter S:

$$L(\mathbf{r}|\mathbf{S})=H!\prod_{i=0}^{N}\frac{1}{r_{i}!}\left[\sum_{m}P^{N}(i|m)\,s_{m}\right]^{r_{i}}$$

which is then the probability of the occurrence of the particular histogram r when the incoming light has a certain statistics S.
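
As a numerical illustration, the likelihood of a histogram can be evaluated in logarithmic form, which avoids overflow of the factorials for large H. The sketch below assumes the standard multinomial form of the likelihood for H independent trials with bin probabilities Q. The function name `log_likelihood` is hypothetical.

```python
import math

def log_likelihood(r, Q):
    """Natural log of the multinomial probability of histogram r given bin probabilities Q."""
    H = sum(r)
    out = math.lgamma(H + 1)                  # ln H!
    for r_i, q_i in zip(r, Q):
        out -= math.lgamma(r_i + 1)           # - ln r_i!
        if r_i > 0:
            out += r_i * math.log(q_i)        # + r_i ln q_i
    return out

# Two trials, one outcome in each of two equally likely bins: L = 2 * 0.5 * 0.5
L_example = math.exp(log_likelihood([1, 1], [0.5, 0.5]))
```

For a fixed histogram, the likelihood is largest when Q equals the observed frequencies r/H, which is the property the ML estimation exploits.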

As r is measured and therefore known, L(r|S) can be regarded as a function of S only, i.e. L(r|S) quantifies how well a candidate vector S, taken as the incoming probability distribution, accounts for the measured histogram r. According to the ML method, the best estimate of the incoming statistics which produced the histogram r is the vector S_{e} which maximizes L(r|S), where r is treated as fixed. The estimation problem thus reduces to a maximization problem.

*ii. Description of the algorithm*

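For numerical calculations, it is necessary to limit the maximum number of incoming photons to m_{max} (in the following calculations, m_{max}=21). As S is a vector of probabilities, the maximization must be carried out under the constraints that the s_{n} are non-negative and that they add up to one. The positivity constraint can be satisfied changing variables:

$$s_{n}=\sigma_{n}^{2},\qquad \ln L(\mathbf{r}|\Sigma)=C+\sum_{i=0}^{N}r_{i}\ln\left[\sum_{m=0}^{m_{max}}P^{N}(i|m)\,\sigma_{m}^{2}\right]$$

where Σ=[σ_{n}] and C is a constant. The condition that the s_{n} add up to one can be taken into account using the Lagrange multipliers method:

$$F(\mathbf{r}|\Sigma,\alpha)=\ln L(\mathbf{r}|\Sigma)-\alpha\left(\sum_{n=0}^{m_{max}}\sigma_{n}^{2}-1\right)$$

where α is the Lagrange multiplier. After developing (Banaszek 1998) the set of m_{max}+2 gradient equations ∇F=0 (taken with respect to Σ and α), which fixes α=H, the set of m_{max}+1 nonlinear equations to be solved with respect to Σ is:

$$\sigma_{l}\left[\sum_{i=0}^{N}\frac{r_{i}}{H}\,\frac{P^{N}(i|l)}{\sum_{m=0}^{m_{max}}P^{N}(i|m)\,\sigma_{m}^{2}}-1\right]=0 \tag{11}$$

for l=0,…, m_{max}. The set of equations (11) can be solved by standard numerical methods.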
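
One standard numerical method for this kind of constrained maximization is the expectation-maximization (EM) iteration. It is not necessarily the method the authors used. The sketch below works directly with s_{l}=σ_{l}^{2} and repeatedly multiplies each s_{l} by Σ_{i} (r_{i}/H)·P(i|l)/q_{i}, where q=P∙s. A fixed point of this update has, for every l, either s_{l}=0 or a multiplier equal to one, which is the stationarity condition of the constrained likelihood. The sketch assumes that every column of P sums to one. The names `ml_estimate` and `n_iter`, and the 3×3 test matrix, are hypothetical.

```python
import numpy as np

def ml_estimate(r, P, n_iter=2000):
    """EM iteration for the ML estimate of S from histogram r and matrix P[i, m]."""
    r = np.asarray(r, dtype=float)
    P = np.asarray(P, dtype=float)
    f = r / r.sum()                               # measured frequencies r_i / H
    s = np.full(P.shape[1], 1.0 / P.shape[1])     # uniform, strictly positive start
    for _ in range(n_iter):
        q = P @ s                                 # predicted statistics for current s
        ratio = np.divide(f, q, out=np.zeros_like(f), where=q > 0)
        s = s * (P.T @ ratio)                     # multiplicative update keeps s >= 0
        s /= s.sum()                              # guard against round-off drift
    return s

# Toy detector: 2 sections seen as a binomial loss channel with 50% efficiency
P_toy = np.array([[1.0, 0.5, 0.25],
                  [0.0, 0.5, 0.50],
                  [0.0, 0.0, 0.25]])
S_true = np.array([0.2, 0.5, 0.3])
r_toy = 100_000 * (P_toy @ S_true)                # noiseless histogram
S_est = ml_estimate(r_toy, P_toy)
```

With a noiseless histogram and an invertible matrix, as in this toy case, the iteration recovers the input distribution. With real data and a matrix that has fewer rows than columns (*N*+1 < m_{max}+1), the maximum is in general not unique and the result depends on the starting point.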

*iii. ML reconstruction*

To test the effectiveness of the reconstruction algorithm, an 8.6×8 μm^{2} 5-PND was tested with the coherent emission from a mode-locked Ti:sapphire laser. Q_{ex} was determined as described in section 6.2.

The device was already characterized in terms of its conditional probability matrix P^{5} (Figure 11), so it was possible to carry out the ML estimation of the different incoming distributions with which the device was probed. Because of the bound on the number of incoming photons that can be represented in our algorithm (m_{max}=21), and because, for a coherent state, losses simply reduce the mean of the distribution, the ML estimation was performed considering µ^{*}=µ/10 and *η*^{*}=10*η* (the efficiency of each section being lower than 10%).
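
The equivalence behind this rescaling can be checked numerically. The sketch below assumes ideal Poissonian light and the beam-splitter model of section 6.1 with independent detections. Under these assumptions the number of photons detectable by stripe i is itself Poissonian with mean µ*η*_{i}/*N*, so the stripe fires with probability 1-exp(-µ*η*_{i}/*N*), independently of the others, and the measured statistics depend only on the products µ*η*_{i}. The function name `q_poisson` and the numerical values are hypothetical.

```python
import numpy as np

def q_poisson(mu, eta):
    """Distribution of the number of firing stripes for Poissonian light of mean mu."""
    eta = np.asarray(eta, dtype=float)
    N = eta.size
    # Independent firing probability of each stripe
    p_fire = 1.0 - np.exp(-mu * eta / N)
    Q = np.array([1.0])
    for p in p_fire:
        Q = np.convolve(Q, [1.0 - p, p])   # add one independent stripe
    return Q

Q_real = q_poisson(20.0, [0.03] * 5)       # actual mean photon number and efficiencies
Q_scaled = q_poisson(2.0, [0.30] * 5)      # mean divided by 10, efficiencies times 10
```

The two distributions coincide, so the rescaled problem describes the same measured statistics with an incoming distribution that fits within the bound m_{max}.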

Figure 12 shows the experimental probability distribution of the number of measured photons Q_{ex} obtained from the histograms measured when the incoming mean photon number is µ=1.5, 2.8, 4.3 photons/pulse (Figure 12.a, b, c respectively, in red), from which the incoming photon number distribution is reconstructed. The ML estimate of the incoming probability distribution S_{e} is plotted in Figure 12.d, e, f (green), where it is compared to the real incoming probability distribution S (blue). The estimation is successful only for low photon fluxes (µ=1.5, 2.8; Figure 12.d, e) and it already fails for µ=4.3 (Figure 12.f). In Figure 12.a, b, c, Q_{ex} (red) is compared to the distributions obtained from S and S_{e} through relation (1) (Q and Q_{e}, in blue and green, respectively).

The main reasons why the reconstruction fails are not the low efficiencies of the sections of the PND or the spread in their values, but rather the limited counting capability (*N*=5) and a poor calibration of the detector, i.e. an imperfect knowledge of its real matrix of conditional probabilities. This assessment is supported by the following argument. If we generate Q_{ex} with a Monte Carlo simulation (see section 6.1) using the same *η* vector of Figure 11 and a Poissonian incoming photon number distribution, and then we run the ML reconstruction algorithm (using the same P^{5}, which this time describes the detector perfectly), S can be estimated up to much higher mean photon numbers (μ≥16, see Figure 13). Additional simulations will be needed to evaluate the performance of PNDs for the measurement of other, nonclassical photon number distributions. To alleviate the calibration problem, a self-referencing measurement technique might be used (Achilles et al. 2006).

## 7. Discussion on the counting capability

Several factors may limit the counting capability (M_{max}) of a PNR detector.

One is the detection efficiency. From (5), assuming the detector saturation is negligible (*n*<<*N*) and that all the branches are equal (*η*_{i}=*η*), the probability Q(*n*) of detecting *n* photons is proportional to *η*^{n}. In the PND tested, the low efficiency of the sections (2.9±0.5%, see section 6.2) limits the *n*-photon states measurement for *n*>>1. Nevertheless, the *η* of SSPDs, which are based on the same detection mechanism, can be increased up to ~60% (Rosfjord et al. 2006), and could potentially exceed 90% using optimized optical cavities. We also stress that uniform illumination of the wires is needed to achieve the optimum performance.
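
The impact of the *η*^{n} scaling can be made concrete with a short calculation. The sketch below assumes uniform efficiencies and uniform illumination, so that the probability of detecting all *n* incoming photons is the probability that they land on *n* distinct stripes, *N*!/((*N*-*n*)!·*N*^{n}), times *η*^{n}. The function name `p_all_detected` and the chosen efficiencies are illustrative.

```python
from math import perm

def p_all_detected(n, N, eta):
    """Probability that n photons hit n distinct stripes of an N-PND and are all detected."""
    return perm(N, n) * (eta / N) ** n

# Compare a ~3% efficient device with hypothetical 60% and 90% efficient ones (N=5)
for eta in (0.03, 0.60, 0.90):
    row = [p_all_detected(n, 5, eta) for n in range(1, 6)]
    print(eta, ["%.2e" % p for p in row])
```

Each additional photon costs a factor of order *η*, so at efficiencies of a few percent the full detection of several photons becomes a very rare event.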

The second limitation is the intrinsic noise of the detector. As the currents from the sections of the PND are summed up to build the output, pulse height discrimination is used to achieve photon number resolution. This makes the noise performance of the device critical for its counting capability, as independent noisy signals are summed. Indeed, photon-number discrimination can be performed as long as the noise on the signal amplitude remains lower than the height of the one-photon pulse. The noise properties of any avalanche-based photon counting device are limited by its inner multiplication noise. In other avalanche PNR detectors (Waks et al. 2003, Waks et al. 2004, Zambra et al. 2004, Yamamoto et al. 2006, Fujiwara & Sasaki 2007, Gansen et al. 2007, Kardynal et al. 2007) the amplitude of the output signal is directly proportional to the number of carriers generated by single photon absorption events through a multiplication process which is intrinsically noisy. The noise on the multiplication gain is then completely transferred to the signal, which is then affected by a fluctuation of the same order. In contrast, with PNDs, the noisy avalanche carrier-multiplication process (Semenov et al. 2001) causes a fluctuation only in the resistance *R*_{hs} of the branch driven normal after the absorption of a photon and not in the output current. Indeed, the amplitude of the photocurrent peak is determined by the partition between the fluctuating resistance *R*_{hs} of a few kΩ and a resistance *R*_{out} almost 2 orders of magnitude lower, which is of fixed value. Comparing the broadening of the histogram peaks relative to different numbers of detected photons *n* (Figure 9), no multiplication noise buildup is observable, as the variance of the peak does not increase with *n*. The broadening of the peaks is then exclusively due to electrical noise originating from amplifiers and is not a fundamental property of the detector. To a good approximation the excess noise factor F (McIntyre 1966) of the PND is then close to unity and is not limiting M_{max}, which is not the case for most of the other approaches to PNR detection (Waks et al. 2003, Waks et al. 2004, Zambra et al. 2004, Rosenberg et al. 2005, Yamamoto et al. 2006, Fujiwara & Sasaki 2007, Gansen et al. 2007, Kardynal et al. 2007).

In PNDs, a third limitation to M_{max} arises from the leakage current *δI*_{lk}, which limits the bias current and therefore *η*. However, as discussed in section 5.2, this issue can be overcome with a careful design of the device.

## 8. Conclusions

A new PNR detector, the Parallel Nanowire Detector, has been demonstrated (Divochiy et al. 2008, Marsili et al. 2009a), which significantly outperforms existing approaches in terms of sensitivity, speed and multiplication noise in the telecommunication wavelength range. In particular, it provides a repetition rate (80 MHz) three orders of magnitude larger than any existing detector at telecom wavelength (Rosenberg et al. 2005, Fujiwara & Sasaki 2007, Jiang et al. 2007), and a sensitivity (NEP=4.2×10^{-18} W/Hz^{1/2}) one to two orders of magnitude better, with the exception of transition-edge sensors (Rosenberg et al. 2005) (which require a much lower operating temperature).

An electrical equivalent model of the device was developed to study its operation and to perform its design (Marsili et al. 2009b). In particular, we found that the leakage current significantly affects only the PND detection efficiency, while it has a marginal effect on its signal to noise ratio. To gain a better insight into the device dynamics, the (*N*+1)-mesh equivalent circuit of the *N*-PND was simplified and reduced to a three-mesh circuit, so that the analytical expression of its transient response could be easily found. With this approach, we could predict a physical limit to the recovery time of the PND, which is slower than that previously estimated. Furthermore, the figures of merit of the device performance in terms of efficiency, speed and sensitivity (_{0}, SNR) were defined and their dependency on the design parameters (*L*_{0}, *R*_{0}, *R*_{out}, *N*) was analyzed.

To prove the suitability of the PND to reconstruct unknown light statistics by ensemble measurements, we developed modeling tools to fully characterize the device and a maximum likelihood estimation algorithm (Marsili et al. 2009a, Marsili et al. 2009b). Testing a 5-PND with Poissonian light, we found the reconstruction of the incoming photon number probability distribution to be successful only for low photon fluxes, most likely due to the limited counting capability (*N*=5) and the poor calibration (i.e. the imperfect knowledge of the real matrix of conditional probabilities) of the detector used, and not to its low detection efficiency (