The historical development of technology can inform future innovation, and while theses and review articles attempt to set technologies and methods in context, few can discuss the historical background of a scientific paradigm. In this chapter, the nature of the photon is discussed along with what physical mechanisms allow detection of single-photons using solid-state semiconductor-based technologies. By restricting the scope of this chapter to near-infrared, visible and near-ultraviolet detection we can focus upon the internal photoelectric effect. Likewise, by concentrating on single-photon semiconductor detectors, we can focus upon the carrier-multiplication gain that has allowed sensitivity to approach the single-photon level. This chapter and the references herein aim to provide a historical account and full literature review of key, early developments in the history of photodiodes (PDs), avalanche photodiodes (APDs), single-photon avalanche diodes (SPADs), other Geiger-mode avalanche photodiodes (GM-APDs) and silicon photo-multipliers (Si-PMs). As there are overlaps with the historical development of the transistor (1940s), we find that development of the p-n junction and the observation of noise from distinct crystal lattice or doping imperfections – called “microplasmas” – were catalysts for innovation. The study of microplasmas, and later dedicated structures acting as known-area, uniform-breakdown artificial microplasmas, allowed the avalanche gain mechanism to be observed, studied and utilised.
- single-photon avalanche diodes
- p-n junctions
- internal photoelectric effect
- photon counting
Optics has seen significant progress in the last 100 years. We now take as routine that we can detect single-photons, count them and time them so accurately that we can study phenomena that would have been treated as science fiction in the days of quantum pioneers such as Max Planck, Albert Einstein and Werner Heisenberg. We are able to use time of flight, at the single-photon level, to image objects in three dimensions, provide laser ranging for autonomous car applications, and monitor the timing of a few scattered photons to observe objects blocked from view. In biomedicine, we detect single-photons from fluorescing biological samples, computing the fluorescence lifetime and using it as a window into reactions. Quite routinely we use photon counters in the form of photomultiplier tubes (PMTs) to detect the coincidence of gamma radiation, using scintillators for positron emission tomography (PET) and charge counting detectors for accurate X-ray computed tomography (CT). Newer techniques such as Raman spectroscopy require sensitive instrumentation, and increasingly spectroscopy is being used to illuminate biological phenomena. In physics, photon counting is crucial in high-energy physics (e.g. ATLAS and CMS at CERN), with many experiments testing quantum theory only becoming possible with technological progress in these technologies. In communications, quantum key distribution (QKD) and few-photon communication links have been achieved.
Despite this, to study and discuss photon counting, it is also crucial to understand the historical underpinnings of the technologies in use today. Surely we must not take technologies for granted, but understand where such innovation came from? If we wish to progress, surely we must know, acknowledge and expand upon what has been tried, tested and shown to be successful (or not) in the past?
This chapter will focus on detection of light using semiconductors such as Silicon and Germanium. We will discuss the nature of the photon and the mechanisms whereby light can interact with matter for that detection. We are fortunate that light detection, especially in the visible spectrum utilising the internal photoelectric effect, is a mature technology. But how did we progress towards the ability to detect, count and time single-photons, the solitary quanta of electromagnetic radiation? We will focus on a phenomenon known as photo-carrier multiplication or avalanche, which can provide either linear gain or run-away generation that increases the sensitivity of an optical detector beyond what we could achieve with amplifiers and other gain mechanisms. There are numerous technical review articles on single-photon detectors [1, 2], avalanche diodes and single-photon avalanche diodes (SPADs) [4, 5]. However, these take the typical technical view and rarely (and if at all, poorly) discuss the historical development of these scientific and engineering breakthroughs. This chapter will therefore provide a literature review of early historical development of Silicon and Germanium solid-state semiconductor detectors, the use of p-n junctions, the noise sources that were observed within Shockley’s first transistor devices, and how this led to the discovery and utilisation of avalanche gain.
2. The photon: philosophy, nature and theory for engineers
Before discussing the history of semiconductor photon counters, a question arises: what is a photon? As there are numerous philosophical interpretations as to the nature of the photon, these will be briefly covered. As scientists, we must not forget questions of interpretation, as it is easy to view theories as gospel and particles as real physical entities. We can easily forget the assumptions, inferences and observations by proxy that have led to the current shared scientific view. This section will also cover several fundamental theories before we discuss detection and the properties we now exploit in many applications.
There are two major schools of thought when it comes to evidence interpretation. We can view observations of light’s effects from either a wave or particle viewpoint, the so-called wave-particle duality problem. However, when it comes to existence and knowledge, we can take either a ‘realist’ or ‘anti-realist’ viewpoint.
Realists hold that objects, conditions and processes, if described by correct, fully-evidenced theories, are indeed real. The photon is a real particle, or a physical wave, or its wave packet is an entity. Realists seek the truth of nature using scientific methods and treat a robust theory also as the real and correct representation of how nature operates. Realism is divided into those that are (i) realists regarding theories but not objects, (ii) realists concerning objects but not theories, and (iii) realists about both.
Anti-Realists state there is no real object, condition or process. They hold theoretical entities and the theory itself as purely ways to visualise a phenomenon, aiding understanding or a method only of prediction. They state that, linguistically, we need a shared nomenclature and conceptual model to think about complex notions.
Both have their merits, and the reader must decide where they sit on this spectrum. There are several issues that prompt further thought. If a theory is shown to be accurate, thus implying an entity, and that theory is later shown to be incorrect by new evidence, then the original logical construction for that entity requires a new theory. We might incorporate entities of similar type and therefore share the nomenclature in our new conceptual model; however, the original entity has now been replaced. The theory of Phlogiston is a useful case: once the theory was falsified, science adopted the oxidation theory of combustion. The entity Phlogiston, despite having been the ‘embodiment of truth’, was no longer logically founded.
Wave-particle duality opens a further issue, in that we can interpret observations from both viewpoints. With both being theories that accurately fit experiment, provide prediction and aid understanding, the theory to use depends on the phenomena being considered. If both theories are robust, is it the particle that is real or the wave? Stephen Hawking formed a third interpretation known as ‘model-dependent realism’. This states that reality should be understood using our models and theories, but that it is not possible to prevent a theory being falsified by future experimental findings. At best, a theory can only be true with respect to current observations. This is related to Kuhn’s idea of theory choice and the concept of theory falsifiability. A theory should be evaluated by how accurately it describes observation and whether it can make predictions of hitherto unseen phenomena. If there are several logical, tested theories that overlap, such as wave-particle duality, then model-dependent realism holds that multiple equally valid realities exist. This does not sit well with many; therefore, a pseudo-realist view can be taken where we choose which model we need. This dichotomy and choice of theory shows that we are never truly realist with respect to these theoretical entities. Put simply, the photon is nothing more than a useful ‘aide-memoire’ when we choose to use Maxwell’s equations for wave motion.
2.1. Photon or anti-photon: an elusive concept
In Loudon’s treatise on the quantum theory of light, it is made clear from the outset that the word photon is somewhat of a vague, theory-loaded term that can lead to confusion. The word was coined by G. Lewis in a 1926 Nature paper. This is surprising, as many would cite Einstein’s 1905 paper on the photoelectric effect as a suitable definition of a ‘photon’, i.e. a single quantum of electromagnetic radiation. The history of quantized optics is described by Lamb and Loudon, with much modern theory and many philosophical questions discussed elsewhere. However, Lamb was highly critical of the term, suggesting it should be used only by “properly qualified people”. Indeed, Einstein famously stated, “All the fifty years of conscious brooding have brought me no closer to answering the question, ‘What are light quanta?’ Of course, today every rascal thinks he knows the answer, but he is deluding himself.”
One may ask where Lamb’s criticism of the word photon comes from. Firstly, Lewis, when coining the term, suggested it as a real, physical particle (i.e. realism), as a method of explaining chemical valence. Specifically, he hypothesised the photon as a mediator of radiation from one atom to another, helping to explain how a molecule such as Hydrogen gas can be stabilised by two electrons that sometimes can have a strong attractive rather than repulsive force. However, he explicitly denied it related to the quantum of light discussed by Planck and Einstein. Secondly, Lamb expounds the view that all uses of the term ‘photon’ can be more accurately thought of using quantization of wave interpretations; explicitly, he states: “With more complicated states, it is terribly difficult to talk meaningfully about ‘photons’ at all”. He cites the work of Wentzel and Beck (1926), as they show that the photoelectric effect can be described by quantum theory without the use of light quanta.
2.2. Photons: energy and statistics
As Loudon suggests, the impression the word gives is of an indistinct, fuzzily-bound globule of light that travels from point A to B through optical equipment or free space. Many would conceive photons as bullets travelling as a stream making up distinct rays. Despite that view, a photon can be more correctly thought of as an electromagnetic field within a cavity of length $L$. As with sound waves, there are an infinite set of spatial-modes discretized into integer divisions of the cavity spacing. Extension to open cavities can be achieved by considering an experiment of finite size but with no identifiable cavity, or by viewing the system as discrete travelling-wave modes. This scenario can be described by a quantized harmonic-oscillator of angular frequency, ω. If we take a single spatial-mode, we can write its Hamiltonian in the form of Eq. (1), $\hat{H} = \hat{p}^2/2\mu + \tfrac{1}{2}\mu\omega^2\hat{x}^2$ [9, 12]. This relates to a pseudo particle with an effective mass, μ, an angular frequency, ω, a position, x, and a momentum, p. This is a wave-mechanical description of a particle, where matrix mechanics (Heisenberg 1925–1926) can be used to form an analogy with the 4-dimensional (x, y, z, t) electric and magnetic fields within a cavity.
As Lamb suggests, rather than using x, y, z as spatial coordinate labels for a spatial-mode, the vector k, called the wave-vector, can be used. For a simple 1-D case, k takes a discrete set of values: one for the fundamental, i.e. a single wave period within the cavity, another for the second harmonic, and so on for n periods of the wave in the cavity. To come back to ‘photons’ and their definition, if we assume we have a single mode, which has an associated ‘number state’, $n$ or $|n\rangle$, we can give the energy of that state in the form of Eq. (2), $E_n = (n + \tfrac{1}{2})\hbar\omega$ [9, 12]. The state $n = 0$ is called the ground or vacuum state and still represents a finite energy level, although this cannot be detected without specialist experimentation and does not contribute to photon counting.
The spatial-mode, k, therefore contains $n_k$ ‘photons’, and the angular frequency is $\omega_k$. As we can consider photon ‘creation’ as being the increase in electromagnetic energy in a mode, while a decrease in energy represents photon ‘destruction’, the state can only be excited by integer multiples of $\hbar\omega$, where ħ is the reduced Planck constant, $\hbar = h/2\pi$. The energy of a photon is defined using Eq. (3), $E = \hbar\omega = hc/\lambda$, where c is the speed of light and $\lambda$ is the wavelength.
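As a rough numerical illustration of the number-state energy ladder and the single-photon energy described above, the following Python sketch evaluates $(n + \tfrac{1}{2})\hbar\omega$ and $hc/\lambda$. The function names are hypothetical, the 650 nm wavelength is chosen only for illustration, and the ladder form assumes the standard quantized harmonic-oscillator result for the energy referred to as Eq. (2).

```python
import math

# CODATA constants in SI units
PLANCK_H = 6.62607015e-34        # Planck constant, J s
LIGHT_C = 2.99792458e8           # speed of light in vacuum, m/s
ELEMENTARY_Q = 1.602176634e-19   # elementary charge, C (joules per eV)
HBAR = PLANCK_H / (2.0 * math.pi)


def angular_frequency(wavelength_m):
    """Angular frequency (rad/s) of light with the given vacuum wavelength."""
    return 2.0 * math.pi * LIGHT_C / wavelength_m


def number_state_energy_ev(n, wavelength_m):
    """Energy of the n-photon number state, (n + 1/2) * hbar * omega, in eV.

    Assumes the standard harmonic-oscillator ladder for a single mode.
    """
    return (n + 0.5) * HBAR * angular_frequency(wavelength_m) / ELEMENTARY_Q


def photon_energy_ev(wavelength_m):
    """Energy of one photon, h * c / wavelength, in eV."""
    return PLANCK_H * LIGHT_C / wavelength_m / ELEMENTARY_Q
```

Calling `photon_energy_ev(650e-9)` gives the energy step between adjacent number states of a 650 nm mode, while `number_state_energy_ev(0, 650e-9)` gives the vacuum-state energy, which is half of that step.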
One particular reason why the single packet of energy has become useful as a conceptual framework is that the single-photon state can produce a single current-pulse in a photodetector that uses ionisation (p2). This significant point allows us to use the word ‘photon’ from a detection viewpoint, removing some wave and probability distribution details. If we try to measure the number of photons within a mode, we would find a probability, $P(n)$, of finding a given number, $n$. This, we will see, leads to Poisson statistics. Photon creation, destruction and thermal variation will contribute to the fluctuation in photon number. In Eq. (4), $\langle n \rangle = 1/[\exp(\hbar\omega/k_B T) - 1]$ denotes the mean number of thermally excited photons in a field-mode at temperature, $T$, with Boltzmann constant $k_B$. This is called the Planck thermal excitation function (p10) and sets the probability that the mode is excited to $n$ photons at thermal equilibrium.
If we think of an ensemble of measurements (over time or separate identical systems), the probability, $P(n)$, of finding exactly $n$ photons within an optical cavity with a mean photon number, $\langle n \rangle$, is given by the Geometric distribution of Eq. (5), $P(n) = \langle n \rangle^n / (1 + \langle n \rangle)^{1+n}$. This holds only if we consider a period, t, greater than the characteristic time scale of any fluctuations (p14).
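The thermal excitation function and the Geometric photon-number distribution discussed above can be sketched numerically as follows. This is a minimal illustration, assuming the textbook forms $\langle n \rangle = 1/[\exp(\hbar\omega/k_B T) - 1]$ and $P(n) = \langle n \rangle^n/(1 + \langle n \rangle)^{1+n}$ for Eqs. (4) and (5); the function names and the truncation limit `n_max` are hypothetical choices.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J/K
LIGHT_C = 2.99792458e8   # speed of light in vacuum, m/s


def mean_thermal_photons(wavelength_m, temperature_k):
    """Mean thermal photon number of one mode (assumed Planck form)."""
    omega = 2.0 * math.pi * LIGHT_C / wavelength_m
    # expm1(x) = exp(x) - 1, evaluated accurately even for small x
    return 1.0 / math.expm1(HBAR * omega / (K_B * temperature_k))


def geometric_probability(n, mean_n):
    """P(n) = mean^n / (1 + mean)^(n + 1), written to avoid overflow."""
    ratio = mean_n / (1.0 + mean_n)
    return ratio ** n / (1.0 + mean_n)


def distribution_moments(mean_n, n_max=500):
    """Return (sum of P, mean, variance) of the truncated distribution."""
    probs = [geometric_probability(n, mean_n) for n in range(n_max + 1)]
    total = sum(probs)
    mean = sum(n * p for n, p in enumerate(probs))
    variance = sum((n - mean) ** 2 * p for n, p in enumerate(probs))
    return total, mean, variance
```

For a visible-wavelength mode at room temperature the mean thermal occupation is vanishingly small, which is consistent with thermal photons rarely troubling visible-light photon counting. The moments returned by `distribution_moments` show a variance of $\langle n \rangle + \langle n \rangle^2$, larger than the mean.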
Lasers and the radiative transitions of excited atoms are often treated as ideal. Despite this, there are processes such as collision, power and Doppler broadening that modify the shape of a spectral emission line. These allow the electric field and amplitude of a light beam to vary in a time dependent manner. The time scales of these fluctuations are inversely proportional to the differences in optical frequencies produced by line-broadening processes. Hence a fine line-width laser is more coherent than a chaotic source such as an incandescent lamp. If we consider a continuous wave emitted by an atom of a gas, a collision with another atom will present a random, abrupt change in the phase of the wave (p84). The mean time between such events is called the coherence time, $\tau_c$, and is expressed by Eq. (6), $\tau_c = \lambda_0^2/(c\,\Delta\lambda)$, where $\lambda_0$ is the center wavelength, $\Delta\lambda$ is the spread of the line due to broadening and $c$ is the speed of light. For an example 650 nm laser, a spectrum of 1 nm width yields a coherence time of 1.41 ps. For an LED, the spectrum may be wider with a FWHM of 20 nm, giving a coherence time of 0.07 ps.
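The two coherence times quoted above can be checked with a short calculation. The sketch below assumes the common approximation $\tau_c = \lambda_0^2/(c\,\Delta\lambda)$, which reproduces both quoted values; the function name is hypothetical.

```python
LIGHT_C = 2.99792458e8  # speed of light in vacuum, m/s


def coherence_time_s(center_wavelength_m, linewidth_m):
    """Coherence time from centre wavelength and spectral width.

    Assumes tau_c = lambda0^2 / (c * delta_lambda); other definitions
    differ by constant factors depending on the line shape.
    """
    return center_wavelength_m ** 2 / (LIGHT_C * linewidth_m)


laser_tau_ps = coherence_time_s(650e-9, 1e-9) * 1e12   # 1 nm wide laser line
led_tau_ps = coherence_time_s(650e-9, 20e-9) * 1e12    # 20 nm FWHM LED
print(f"laser: {laser_tau_ps:.2f} ps, LED: {led_tau_ps:.2f} ps")
```

Because the coherence time scales inversely with the linewidth, narrowing the line by a factor of twenty lengthens the coherence time by the same factor.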
Expanding this to many atoms, we can write the total electric field amplitude, $E(t)$, as Eq. (7), where $a(t)$ represents the amplitude variation and $\varphi(t)$ represents the phase variation with time, $t$, of the phasor addition of all constituent radiating atoms (p84).
Detection can be viewed using semi-classical theory. This is the combination of a classical (i.e. wave) treatment of incident radiation with a quantum (i.e. particle) treatment for the atomic detector. If we call the number of photon counts produced in an experimental integration time, $T$, $m$, and repeat the experiment multiple times, we can define a probability of finding $m$ photodetections, $P_m(T)$. The mean number of photodetections, $\langle m \rangle$, can be calculated from $\langle m \rangle = \xi \bar{I} T$, where $\xi$ models the detector’s finite efficiency and $\bar{I}$ is a cycle averaged incident intensity. Loudon (p121) derives $P_m(T)$, Eq. (8), which has the same form as the Poisson distribution, $P_m(T) = \langle m \rangle^m e^{-\langle m \rangle}/m!$, with the rate parameter being equal to $\langle m \rangle$.
This applies to light from stimulated emission, i.e. coherent light, but also to chaotic light when the integration time of the measurement is much longer than the coherence time, such that fluctuations in intensity are averaged. There is therefore a continuum between the Geometric and Poisson distributions depending on the integration time and coherence time (p122). Normally we consider the variance of the Poisson distribution to be equal to the mean, i.e. $(\Delta m)^2 = \langle m \rangle$; however, fluctuations in the light produce a departure from this assumption. The variance of the distribution is given by Eq. (9), where $\bar{I}(t, T)$ is the mean intensity that falls on the photodetector during the period from $t$ to $t + T$ (p122).
The first term is photon shot noise and is the primary assumption used for the statistics of coherent light. The second term represents an excess noise linked to “wave fluctuations” of both coherent and incoherent sources. The variance simplifies to Eq. (10) (p122). If the measurement time is longer than the coherence time, then the second term is negligible and we can assume that $(\Delta m)^2 = \langle m \rangle$, i.e. pure Poisson statistics. If the measurement time is equal to or less than the coherence time, then the second term increases, leading to super-Poisson statistics. Sub-Poisson statistics, i.e. squeezed light states, are also valid; however, this requires a quantum treatment of the electromagnetic field (p201).
However, for chaotic light, such as a thermal source or an LED, the variance of Eq. (10) simplifies into a different, more Geometric form, given by Eq. (11) (p122 and p199). Here we see that such sources follow super-Poisson statistics. It is now understandable why lasers with fine line widths are used for (i) physical experiments requiring photon counting and (ii) high speed (100 Gb/s) optical communication links where noise about mean signal amplitudes represents a significant contribution to errors.
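The difference between Poisson and super-Poisson counting statistics can be made concrete with a small numerical sketch. It assumes the Poisson form $P_m = \langle m \rangle^m e^{-\langle m \rangle}/m!$ for coherent light and, as the textbook limit for chaotic light measured over a time much shorter than the coherence time, a variance of $\langle m \rangle + \langle m \rangle^2$. The function names and the truncation limit `m_max` are hypothetical.

```python
import math


def poisson_probability(m, mean_m):
    """Poisson probability of m counts, evaluated in log space."""
    log_p = m * math.log(mean_m) - mean_m - math.lgamma(m + 1)
    return math.exp(log_p)


def poisson_moments(mean_m, m_max=400):
    """Numerically recover the mean and variance of the Poisson counts."""
    probs = [poisson_probability(m, mean_m) for m in range(m_max + 1)]
    mean = sum(m * p for m, p in enumerate(probs))
    variance = sum((m - mean) ** 2 * p for m, p in enumerate(probs))
    return mean, variance


def chaotic_short_time_variance(mean_m):
    """Assumed limiting variance for chaotic light when T << coherence time."""
    return mean_m + mean_m ** 2


def fano_factor(variance, mean_m):
    """Variance-to-mean ratio: 1 for Poisson, above 1 for super-Poisson."""
    return variance / mean_m
```

For a mean of five counts, the Poisson case returns a Fano factor of one, whereas the chaotic short-time limit gives a Fano factor of six, illustrating why narrow-linewidth lasers are preferred when count fluctuations matter.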
3. High-sensitivity optical to electrical conversion
There are three processes that lead to photon interaction, and thus detection. These are the (i) photoelectric effect [11, 15], (ii) Compton scattering and (iii) pair production, with the final two phenomena only occurring at high photon energies.
A photon can be scattered by an atomic electron, a process called Compton scattering. It imparts some energy to the electron, meaning that the photon energy is decreased, and thus the wavelength becomes longer. Compton scattering therefore does not destroy the photon. This effect is reported to be small for energies “below tens of keV”; for reference, 10 keV corresponds to a wavelength of 0.124 nm and 1 keV to 1.24 nm.
In pair-production, the photon energy is enough to result in the creation of an electron-positron pair, the positron being the anti-matter counterpart of the electron. The energy must be higher than $2 m_e c^2$, where $m_e$ is the electron rest-mass. As this is 1.02 MeV, i.e. 0.0012 nm, this process occurs only for X-ray and Gamma-ray interactions.
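The energy thresholds quoted for Compton scattering and pair production can be cross-checked by converting between photon energy and wavelength. The sketch below uses $\lambda = hc/E$ and the pair-production threshold of twice the electron rest energy; the function names are hypothetical.

```python
PLANCK_H = 6.62607015e-34        # Planck constant, J s
LIGHT_C = 2.99792458e8           # speed of light in vacuum, m/s
ELEMENTARY_Q = 1.602176634e-19   # elementary charge, C (joules per eV)
ELECTRON_MASS = 9.1093837015e-31 # electron rest mass, kg


def energy_ev_to_wavelength_nm(energy_ev):
    """Vacuum wavelength in nm of a photon with the given energy in eV."""
    return PLANCK_H * LIGHT_C / (energy_ev * ELEMENTARY_Q) * 1e9


def pair_production_threshold_ev():
    """Minimum photon energy for pair production, 2 * m_e * c^2, in eV."""
    return 2.0 * ELECTRON_MASS * LIGHT_C ** 2 / ELEMENTARY_Q


threshold_ev = pair_production_threshold_ev()
print(f"pair production: {threshold_ev / 1e6:.3f} MeV, "
      f"{energy_ev_to_wavelength_nm(threshold_ev):.4f} nm")
print(f"10 keV: {energy_ev_to_wavelength_nm(1e4):.3f} nm")
```

Both regimes sit far below the near-ultraviolet wavelengths considered in this chapter, which is why only the photoelectric effect remains relevant.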
This elimination leaves the photoelectric effect – which is subdivided into the external and internal photoelectric effects [11, 14, 15, 16]. We also have further subdivision into the photoconductive effect and photovoltaic effect. Due to the prevalence of these key words in peer-reviewed literature [17, 18], legal patents, company white papers and graduate and undergraduate textbooks, they represent the time and geography independent nomenclature for the field. The internal photoelectric effect is certainly not “irrelevant” for solid-state photonic technologies [11, 15, 18].
3.1. The internal and external photoelectric effect
In the internal and external photoelectric processes, a photon is absorbed, thus destroying the photon. If the photon energy, $h\nu$, is sufficient to overcome the surface work function, $\phi$, of the material (Eq. (12)), a bound-free transition takes place whereby an electron is promoted from an outer electron orbital and is expelled from the surface into an external vacuum (of permittivity, $\varepsilon_0$) [13, 14, 17]. The remaining energy is accounted for by the kinetic energy of the electron as a free particle, whereby it can be accelerated and cause secondary electron emission. This “external photoelectric effect” is utilised within photomultiplier tubes for primary photoelectron generation. The work function, $\phi$, of Silicon is ~4.6 eV (requiring photons of wavelength shorter than approximately 270 nm, i.e. UV). This prompted PMTs to use metals with lower work functions such as Caesium, Rubidium and Antimony, with work functions of 1.95, 2.26 and 4.55 eV for wavelengths of 635, 548 and 272 nm respectively. The Schottky effect, whereby an electric field, $E$, can lower the potential barrier between the material and vacuum by $\Delta\phi$, is also key in PMTs, allowing a reduced ‘effective’ work function and thus a longer wavelength detection threshold.
In contrast, semiconductors have a band-gap, $E_g$, between the valence band, $E_v$ (i.e. outer-orbital bound electrons), and the conduction band, $E_c$ (i.e. a cloud of delocalized electrons) [17, 18]. A photon with energy greater than this band-gap can promote an electron from the valence to the conduction band [16, 17]. As the absence of an electron in a valence state is described as a hole, the “internal photoelectric effect” produces an electron–hole pair [16, 17, 18, 19, 20]. This is a bound-bound or intrinsic transition [13, 15]. The electron is still ejected from the atom; however, it is not ejected from the surface. If two electrodes are placed on the material with a slight potential gradient, or if that potential exists due to a p-n doped junction within the material, the electron–hole pair are separated and drift apart due to their relative charges. With many photo-generated carriers within the material due to numerous incident photons, the bulk conductivity of the material increases [16, 17], allowing a photocurrent to flow through an external circuit. This is the photoconductive mode. Photons of high energy are highly likely to cause band to band transitions; however, as the wavelength increases towards a photon energy close to the band-gap, the likelihood of transition decreases, as described by the absorption coefficient, $\alpha$. This leads to a long-wavelength cut off, $\lambda_c$, given by Eq. (13), $\lambda_c = hc/E_g$. For Silicon, this is 1.1 μm, where the absorption coefficient is 1×10¹ cm⁻¹, whereas at 400 nm it is 1×10⁵ cm⁻¹.
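To give a feel for the numbers above, the following sketch computes the long-wavelength cut-off from the band-gap and the absorption depth implied by the two quoted absorption coefficients. It assumes a room-temperature Silicon band-gap of 1.12 eV and simple exponential (Beer-Lambert) absorption with no surface reflection; the function names are hypothetical.

```python
import math

PLANCK_H = 6.62607015e-34        # Planck constant, J s
LIGHT_C = 2.99792458e8           # speed of light in vacuum, m/s
ELEMENTARY_Q = 1.602176634e-19   # elementary charge, C (joules per eV)


def cutoff_wavelength_um(bandgap_ev):
    """Long-wavelength cut-off, h * c / E_g, in micrometres."""
    return PLANCK_H * LIGHT_C / (bandgap_ev * ELEMENTARY_Q) * 1e6


def penetration_depth_um(alpha_per_cm):
    """Depth 1/alpha at which the intensity falls to 1/e, in micrometres."""
    return 1e4 / alpha_per_cm


def absorbed_fraction(alpha_per_cm, thickness_um):
    """Fraction of light absorbed in a layer (assumed Beer-Lambert law)."""
    return 1.0 - math.exp(-alpha_per_cm * thickness_um * 1e-4)


print(f"Si cut-off: {cutoff_wavelength_um(1.12):.2f} um")      # assumed E_g
print(f"depth at 400 nm: {penetration_depth_um(1e5):.2f} um")
print(f"depth near cut-off: {penetration_depth_um(1e1):.0f} um")
```

Under these assumptions, blue light is absorbed within a fraction of a micrometre of the surface, whereas light near the cut-off needs of the order of a millimetre of Silicon, which is why the junction depth must be matched to the target wavelength.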
As materials have various band-gap energies, diverse materials can be used to detect different wavelengths (e.g. Silicon 350 nm to 900 nm and Germanium 750 nm to 1.6 μm). Typically, one electron–hole pair is produced per absorbed photon, limiting the quantum efficiency (typically ≤1) and the spectral responsivity. The optically-induced current, $I_{ph}$, can be calculated using Eq. (14), assuming a detector thickness much larger than the light penetration depth, $1/\alpha$. Here $P_{opt}$ is the incident optical power, $\mu_n$ is the electron mobility, $\eta$ is the quantum efficiency, $L$ is the distance between the contacts, and $\tau$ is the carrier lifetime.
As the average depth of absorption changes with wavelength, the depth of a photodiode p-n junction is chosen to maximise the received photocurrent. The width of the junction is critical in this; however, it also has implications for bandwidth, which is restricted by three phenomena: (i) the capacitance of the junction, (ii) the time delay of carriers generated outside of the junction diffusing into that junction, and (iii) the drift or transit time, $t_{tr}$, of the carriers within the junction (Eq. (15)).
4. Avalanche multiplication gain: high-sensitivity detection
A challenge central to photon counting is how to apply enough gain such that a single electron–hole pair can produce an appreciable signal for detection. As all electrical readout schemes include thermal noise, the signal, $S$, we obtain from the detection of a single-photon can easily be hidden by thermal noise, $N$. The signal-to-noise ratio (SNR) would be given by $S/N$. If we were to use a traditional amplifier to provide a gain, $G$, the amplifier would also amplify the input noise and would likely contribute further noise terms, $N_{amp}$. This would give an amplified output with a SNR given by Eq. (16), i.e. the SNR would not benefit from amplification.
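The argument that a conventional amplifier cannot improve the SNR can be illustrated with a deliberately simplified model. The sketch below assumes that the amplifier scales both signal and input noise by the same gain and then adds its own noise term linearly, so that the output SNR is $GS/(GN + N_{amp})$; this form, the linear addition of noise (uncorrelated noise would normally add in quadrature) and the function names are all assumptions made for illustration.

```python
def snr_before_amplifier(signal, noise):
    """SNR at the detector output, S / N."""
    return signal / noise


def snr_after_amplifier(signal, noise, gain, amplifier_noise):
    """SNR after an amplifier with the given gain and added noise.

    Assumed simplified model: gain * S / (gain * N + N_amp).
    """
    return gain * signal / (gain * noise + amplifier_noise)


# Example with arbitrary units: the signal is half the size of the noise.
print(snr_before_amplifier(1.0, 2.0))
print(snr_after_amplifier(1.0, 2.0, gain=100.0, amplifier_noise=5.0))
```

In this model the output SNR can at best equal the input SNR, reached only for a noiseless amplifier, which motivates a gain mechanism that acts on the photo-generated carriers before the readout noise is added.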
To provide single photon sensitivity, an ideal amplifier would need to (a) minimally amplify noise, (b) provide a gain to separate the signal from noise sources, (c) have a high bandwidth allowing fast temporal resolution of an event, (d) contribute little extra noise and (e) for many applications would need to be both small and low power. This is where avalanche gain or carrier multiplication becomes critical for modern photon counting applications.
4.1. The avalanche upshot
For a p-n junction, there exists a small built-in potential, $V_{bi}$, which separates photo-generated electron–hole pairs. The avalanche multiplication process occurs at an increased reverse bias voltage, $V_R$, in comparison to standard photodiodes (p317). At this bias, the total energy difference between the p- and n-type regions becomes $q(V_{bi} + V_R)$. The resultant increase in the electric field, $E$, accelerates a free carrier, labelled 1 in Figure 1, to a kinetic energy, $E_k$, sufficient to overcome the ionisation energy (band-gap), $E_g$, of the material (p79) (Eq. (17), where $m^*$ is the effective carrier mass and $v_s$ is the saturation velocity). The actual values are larger than this minimum, at 3.6 eV for electrons and 5.0 eV for holes. Upon a collision between a carrier such as a photoelectron and the Silicon crystal lattice, the accelerated carrier ionises a lattice atom, liberating further carriers. An electron–hole pair, labelled 2 and 2′, is generated, with those carriers then accelerated by the electric field, causing further ionisation [3, 4]. This process continues exponentially, creating an avalanche of carriers within the p-n junction.
The generation rate, $G$, of electron–hole pairs through impact ionisation can be calculated using Eq. (18). Here $J_n$ and $J_p$ are the electron and hole current densities, while $\alpha_n$ and $\alpha_p$ are the electron and hole ionisation rates, i.e. the number of carriers generated by an ionising carrier per unit distance. The ionisation rates vary for different materials and with the electric field strength. We can view the generation rate as being time varying. At time $t_0$, both $J_n$ and $J_p$ are zero, i.e. before a photon liberates a carrier. At $t_1$, there is a photon absorption giving an electron–hole pair. At $t_2$, ionisation has increased $J_n$ and $J_p$, giving more carriers at $t_3$, and therefore a consequent increase in generation rate.
4.2. Avalanche photodiodes (APDs)
A class of detector called an avalanche photodiode (APD) [1, 2, 3] biases the p-n junction such that carrier multiplication achieves a constant generation rate, and thus a constant gain. Careful biasing and device design are needed to ensure the process does not lead to run-away avalanche and catastrophic junction breakdown. The diode therefore produces a photocurrent dependent on the photon flux, with the avalanche gain, $M$, and the width of the p-n junction, $W$, being highly dependent on the applied bias. The width also dictates the proportion of the electron–hole pairs that are captured with varying depth into the surface.
There are two main noise sources. The thermal noise, $i_T$, is given by Eq. (19), where $k_B$ is the Boltzmann constant, $T$ is the temperature, $B$ is the bandwidth of the system and $R_{eq}$ is the parallel combination of the load, series, junction and amplifier input resistances. As avalanche multiplication is inherently random, there is an associated multiplication of the shot noise, $i_s$, Eq. (20), where $F$ is a noise factor associated with the multiplication, and $I_{ph}$ and $I_d$ are the photo- and dark-currents respectively. The multiplication excess noise factor is highly dependent on the ratio of the hole and electron ionisation rates, $k = \alpha_p/\alpha_n$ (Eq. (21), $F = kM + (1 - k)(2 - 1/M)$) (p317), where ideally, we should have more electron than hole multiplication. For Silicon, the ratio $k$ may be low at 0.04. For a gain of $M = 10$, this gives an excess noise factor of 2.22.
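The noise terms above can be explored with a short sketch. It assumes that Eq. (21) takes the widely used McIntyre form, $F = kM + (1 - k)(2 - 1/M)$, which reproduces the quoted excess noise factor of 2.22 for an ionisation ratio of 0.04 when a gain of 10 is assumed. The thermal and shot noise expressions are the standard textbook forms, $4 k_B T B / R$ and $2q(I_{ph} + I_d) M^2 F B$, taken here as assumptions for Eqs. (19) and (20); the function names are hypothetical.

```python
K_B = 1.380649e-23               # Boltzmann constant, J/K
ELEMENTARY_Q = 1.602176634e-19   # elementary charge, C


def excess_noise_factor(gain, k_ratio):
    """Excess noise factor F (assumed McIntyre form) for gain M and ratio k."""
    return k_ratio * gain + (1.0 - k_ratio) * (2.0 - 1.0 / gain)


def thermal_noise_current_sq(temperature_k, bandwidth_hz, resistance_ohm):
    """Mean-square thermal noise current, 4 k_B T B / R, in A^2."""
    return 4.0 * K_B * temperature_k * bandwidth_hz / resistance_ohm


def shot_noise_current_sq(photo_a, dark_a, gain, k_ratio, bandwidth_hz):
    """Mean-square multiplied shot noise, 2 q (I_ph + I_d) M^2 F B, in A^2.

    Assumes both photo- and dark-current are fully multiplied.
    """
    noise_factor = excess_noise_factor(gain, k_ratio)
    total_current = photo_a + dark_a
    return 2.0 * ELEMENTARY_Q * total_current * gain ** 2 * noise_factor * bandwidth_hz


print(f"F = {excess_noise_factor(10.0, 0.04):.2f}")
```

Because the factor grows roughly as $kM$ at high gain, a small ionisation ratio keeps the multiplication noise low, which is one reason Silicon is attractive for avalanche devices.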
Luckily, the avalanche multiplication also multiplies the current caused by the internal photoelectric effect, $I_{ph}$. This is given by Eq. (22), where $P_{opt}$ is the incident optical power in watts and $\eta$ is the quantum efficiency of the photodiode. Avalanche gain can therefore be a highly effective technique for high-sensitivity detector technologies.
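As an illustration of the multiplied photocurrent, the sketch below assumes the standard relation between optical power and photocurrent, $I = M \eta q P \lambda/(hc)$, as the form of Eq. (22). The example values (1 nW at 650 nm, a quantum efficiency of 0.5 and a gain of 10) and the function names are hypothetical.

```python
PLANCK_H = 6.62607015e-34        # Planck constant, J s
LIGHT_C = 2.99792458e8           # speed of light in vacuum, m/s
ELEMENTARY_Q = 1.602176634e-19   # elementary charge, C


def responsivity_a_per_w(wavelength_m, quantum_efficiency):
    """Unity-gain responsivity, eta * q * lambda / (h * c), in A/W."""
    return quantum_efficiency * ELEMENTARY_Q * wavelength_m / (PLANCK_H * LIGHT_C)


def apd_photocurrent_a(optical_power_w, wavelength_m, quantum_efficiency, gain):
    """Multiplied photocurrent in amperes (assumed form of the APD current)."""
    unity_gain = responsivity_a_per_w(wavelength_m, quantum_efficiency)
    return gain * unity_gain * optical_power_w


current = apd_photocurrent_a(1e-9, 650e-9, 0.5, 10.0)
print(f"photocurrent: {current * 1e9:.2f} nA")
```

The gain multiplies the signal inside the device, before readout thermal noise is added, which is the advantage over an external amplifier.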
4.3. Single-photon avalanche diodes (SPADs) and other technologies
Several dedicated photodiode structures can be biased further into reverse bias, giving higher gains and therefore greater sensitivities. In some cases, run-away multiplication can be used to create specialised diodes called single-photon avalanche diodes (SPADs) or Geiger-mode avalanche photodiodes (GM-APDs) [4, 5]. The Geiger region lies beyond the avalanche photodiode region but before the breakdown of a guard ring that surrounds the device. It is called this because run-away avalanche, with its very large gain factors, leads to a current pulse behaviour similar to Geiger-Müller tubes. In modern SPADs, a single photon, yielding a single electron–hole pair, can produce a sizable avalanche photocurrent, and in well-designed circuits can produce a voltage pulse suitable for standard complementary metal-oxide-semiconductor (CMOS) logic with both high temporal accuracy (low jitter) and a short duration (5–20 ns).
Several other photon counting technologies utilise the avalanche breakdown multiplication of carriers. SPADs have been fabricated into large arrays in modern CMOS processes, allowing the advantages of high-speed dedicated on-chip logic and complex signal processing. Silicon photo-multipliers (Si-PMs) can also be made by the parallel combination of avalanche currents across a shared load resistance. Si-PMs employ less complex circuitry, reducing the prospects for single-photon imaging, but often allow higher optical fill-factors for physical experimentation. Both APDs and Geiger-mode devices can be manufactured using III-V materials such as InGaAs/InP for single-photon detection over many wavelengths. Other highly sensitive detectors include electron-multiplying charge-coupled-devices (EM-CCDs) which use avalanche multiplication and micro-channel plate (MCP) detectors which use impact ionisation and the release of secondary electrons.
While the historical literature review of this chapter will centre upon semiconductor sensors, photon detection has a long history of using the traditional photomultiplier tube (PMT). Being vacuum tubes, they are large and require high voltages. However, they are renowned for high temporal resolution and can present a large area detector with low noise and high gain. When the noise is normalised against optically active area, PMTs are often a preferred detector in comparison to solid-state solutions. The choice between detectors is of course a product of the application, with PMTs being unsuitable for high-speed simultaneous, rather than raster-scanned, single-photon imaging. The history of these devices is covered well elsewhere; however, three principal references from the 1930s collectively cover the operation principles and early development of the PMT [22, 23, 24]. Upon a photon absorption, electrons are emitted from a photo-cathode via the external photoelectric effect. A potential between the cathode and an initial dynode accelerates the electrons. Upon hitting the dynode, secondary electron emission acts to increase the number of electrons that are accelerated to an iteratively-biased set of subsequent dynodes. Thus, at the final anode an initial photo-electron can produce an appreciable anode current.
5. Single-photon avalanche diodes: principles and early history
The bulk of this chapter and its references provide a full, robust literature review dedicated to the early history of semiconductor photon counting. As p-n junctions and the avalanche process are utilised in avalanche photodiodes, single-photon avalanche diodes and silicon photo-multipliers, the sections below will track the development of such devices. Starting from initial physical studies on carrier multiplication and avalanche, development of p-n junction transistors and investigation of noise sources within these early transistors, we ultimately end at the use of artificial structures. These man-made structures allowed the study of multiplication in a more deterministic manner, and were eventually used for the detection of ionising radiation and light. For the most part, primary sources will be referenced, with others provided in parentheses, to provide the de-facto literature review for this field, particularly for early historical developments. It must be noted that the explosion in literature from the 1970s to the present naturally restricts the scope of history that can be covered.
5.1. Early history: 1900 to 1939
During the early 20th century, the predominant electrical technology was the vacuum tube. With the rapid expansion of radio technologies there was a demand for electronic amplifiers that were capable of high-gain, high-frequency received signal amplification or the output of high power signals to increase transmission distances. Much work also centred on power rectifiers and tube-diodes, predominantly for power supplies, where demand pushed these towards higher rectification voltages and the handling of increased electrical power throughput. Three developments impacted the field. The first was the discovery of unilateral conduction of crystals (Braun in 1874), contributing to the development of crystal detectors and point-contact diodes. The second was the 1930s advancement of solid-state power rectifiers and diodes formed from Copper Oxide or Selenium. The third was the use of trace gases within evacuated tubes, modifying the electrical properties of the device.
The story of electron multiplication and avalanche can be traced back to these trace-gas evacuated tubes. In 1901 John Townsend, then at Oxford University, showed that initial ionisation of the gas between the cathode and anode of a vacuum tube via X-rays led to an increase in current as the electrode potential difference increased. His hypothesis was that an initial ionisation event led to the exponential collision ionisation of gas atoms . The experimental setup used by Townsend is shown in Figure 2, with a schematic of avalanche multiplication shown on the right. While he derived a theory for ionisation based upon the free-path between atoms and eventually the ionisation energy of the molecule, he also made use of earlier experiments (Stoletow, 1890), whereby ionisation was triggered by ultraviolet (UV) excitation of electrons from Zinc (i.e. electrons provided via the external photoelectric effect). Townsend also hypothesised, fitting against experiment, that at low potentials gas ionisation would occur only on favourable occasions, i.e. with low probability, but that at high potentials the probability of ionisation increased. Townsend demonstrated differences in conductivity based upon the polarity of the potential and the shape of the electrodes, finding diode behaviour he called ‘unipolar conduction’.
In 1903 Townsend extended this analysis to include positive and negative ions and the breakdown or ‘sparking’ potential of gases. Experimentally, UV light was used to initiate ionisation and liberation of electrons from an electrode plate. Townsend showed modification of the breakdown voltage by collision ionisation showing exceptional agreement with his theory . The theory developed here was later used for both DC and AC gain analysis of p-n junction avalanche behaviour and breakdown voltages. The rapid ionisation, prior to pure gas sparking potential, was to become known as a “Townsend discharge”.
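As a hedged illustration of the two stages of Townsend's analysis, the sketch below uses the standard textbook form of his relations rather than the notation of the original papers. The symbols are assumptions introduced here: `alpha` is the number of ionising collisions per unit length, `d` the electrode gap and `gamma` the number of secondary electrons released per ion pair created in the gap. With `gamma = 0` the growth is purely exponential (the 1901 picture); with `gamma > 0` the feedback term produces a finite sparking condition (the 1903 extension).

```python
import math


def townsend_multiplication(alpha: float, d: float, gamma: float = 0.0) -> float:
    """Current multiplication M = I / I0 across a gas gap (textbook Townsend form).

    M = exp(alpha * d) / (1 - gamma * (exp(alpha * d) - 1))

    Returns math.inf once gamma * (exp(alpha * d) - 1) >= 1, i.e. when the
    discharge becomes self-sustaining (the breakdown or 'sparking' condition).
    """
    primary = math.exp(alpha * d)        # exponential growth by collision ionisation
    feedback = gamma * (primary - 1.0)   # secondaries regenerated per initial electron
    if feedback >= 1.0:
        return math.inf
    return primary / (1.0 - feedback)


if __name__ == "__main__":
    # Illustrative values only: alpha * d swept from 1 to 6 with gamma = 0.01.
    for alpha_d in range(1, 7):
        print(alpha_d, townsend_multiplication(float(alpha_d), 1.0, gamma=0.01))
```

The same divergence of the multiplication at a finite field is what later reappears in the treatment of avalanche breakdown in p-n junctions.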
This led directly to the 1913 development of the Thyratron by Langmuir and Meikle. This was an early tube-based run-away avalanche gas tube used for high-voltage power regulation and fast-switching relays [27, 28]. Both a transition from a sharp concentrated breakdown arc into a blue/purple diffuse glow, and a self-maintaining arc suitable for low-frequency rectification were shown. Langmuir developed a theory of electrical conduction in a hard vacuum, without previously assumed positive trace ions, showing a space charge effect that could be reduced in the presence of positive ions from trace gases (e.g. Nitrogen). Langmuir also observed an effect later called ‘bifurcation’ where the I-V curve splits into two traces and can switch between two conductive states. He writes that, “… the current rose steadily, until a potential of about 130 volts was reached. With potentials higher than this, the current would rise to a high value, 0.013 amp. per sq. cm or more, immediately on lighting the filament, and the discharge was accompanied by a strong purple glow. Suddenly, the current fell to 0.005 amp. per sq. cm or less, and at the same time the purple glow vanished”.
Rapid gas ionisation, through Townsend discharges, also prompted the 1908 investigations at Manchester University, into the Geiger-Müller tube for the detection of ionising radiation such as alpha particles , eventually becoming a matured tube concept in 1926, although suitable references are in German. Interestingly, Müller was not involved in the initial concept. A number of phenomena were observed that would later become key issues in photon detection. These were, (i) variation of the number of pulses matching previously established probability laws (i.e. Poisson variation), (ii) variation in the pulse height resulting from particle path differences and thus changes in the degree of ionisation multiplication (i.e. avalanche multiplication noise), and (iii) low pulse rate ‘natural disturbances’ in the presence of no alpha-particles (i.e. the dark count rate (DCR) and background radiation). The interesting point here is that both linear-gain and run-away multiplication leading to breakdown were observed experimentally in gases and explained by theory by the late 1920s. This provided a platform for later solid-state semiconductor investigations.
As shown by Nix , multiple researchers had observed external and internal photoelectric effects in several materials during the 1870s to the early-1930s. In combination with contemporary ideas of using such materials as detectors similar to gas discharge tubes but in a robust, solid-state form, Nix noted that Adams and Day found photocurrents when Platinum-Selenium contacts were illuminated (1876), following on from W. Smith’s findings that Selenium was photo-resistive in 1873 . Other materials were tested, such as Diamond, Silver Sulphide and Lead Sulphide, however Copper Oxide was observed to be photovoltaic in 1927 by Grondahl and Geiger. Copper oxide photoconductivity was then highly studied by multiple authors including Walter Schottky in the early 1930s . The history of other photovoltaic and photoconductive studies during this period is given in the 1967 NASA report by Crossley et al. .
During the 1930s, the use of crystals and point-contact diodes increased, although their use was hindered by the mechanical, electrical and thermal variability of the contact formed by an “active rectifying region” and the “cat’s whisker.” Clarence Zener, then at Bristol University, theorised a form of dielectric voltage breakdown in semiconductor solids in which electrons can be excited by an electric field and may tunnel to a higher energy band, thus increasing conductivity . This was based on experimental work by von Hippel in 1931. By deriving the rate at which this transition occurs, it became clear that the rate was dependent on the energy gap of the material and the applied electric field. Zener’s theory explained both the magnitude of the field as per breakdown observations, and the rapid onset of breakdown. He stated that “Further, the breakdown will occur suddenly as F* (potential) is increased, (transition rate) being increased by a factor of 100 (in our example) when F* changes from 1.0×10⁶ to 1.1×10⁶” . It became clear that this ‘Zener effect’ predominated when a diode’s reverse breakdown voltage was low, but that breakdown in diodes with higher breakdown voltages was often attributable to an avalanche effect similar to Townsend gas discharges. This was key to later findings in the 1950s with respect to noise sources and breakdown effects in early transistors.
5.2. Early solid-state history: 1940 to 1949
During the period of 1940 to 1949, there were two parallel research themes that became critical to both modern technology and the historical development of photon counting technologies. These were: (i) the progression in rectification diodes and the p-n junction that led to the invention of the transistor and (ii) the discovery of photo-effects in Silicon and Germanium, the study of these effects and their utilisation for optical applications. In this section, we will begin with diodes and rectifiers as many of the innovations in optical detectors utilised the pure grown ingots, the theory and the progress in solid-state diodes.
5.2.1. Rectification and the p-n junction
The Second World War was a direct driver of the development of solid-state point-contact rectifiers by the Allied forces. As noted by Scaff and Ohl in 1947 , then at Bell Telephone Laboratories, there was a renaissance in point-contact diodes both from a mechanical robustness and a frequency perspective. As military applications expanded to include radio-frequency (RF) to intermediate-frequency (IF) super-heterodyne detectors (e.g. 3–24 GHz) for Radar and RF to direct-current (DC) rectification, vacuum tubes became limited by the transit time of electrons and the anode-cathode capacitance. Hundreds of materials were tested in the late 1930s and early 1940s, including zinc-oxide, molybdenum-disulphide and iron-sulphide. However, Silicon was found to have the best overall RF characteristics. In perhaps the first use of Silicon wafers cut from pure ingots that used Bell Labs’ early impurity doping processes (e.g. Boron), it was found that electrical performance was directly related to the processing of the Silicon (grinding, polishing and etching), the doping profile and the mechanical construction of the point-contact diode housing. Military, commercial and academic standardisation enabled significant progress to be made in operating frequencies, power handling, SNR and operational lifetimes in harsh military applications.
While point-contact diodes were being improved, researchers such as Russell Ohl noticed that impurities within the crystals could modify their electrical characteristics. In the early 1940s, positive (p-type) and negative (n-type) dopants were explored in samples of Germanium and Silicon . The work undertaken at Bell Telephone Laboratories led directly to advances with p-n junctions and culminated with the patenting of both the point-contact transistor in 1947 (Bardeen and Brattain) and the p-n junction transistor in 1948 (Shockley). The theory, and indeed the history of this development, is covered extensively elsewhere in the literature; however, much of the theoretical research in the 1940s is discussed in a review article by William Shockley in 1949 . Here concepts such as the structure of the material’s band-gap, trapping of charge carriers, p-n and n-p-n transistor theory and indeed “patch effects” due to cracks and discontinuities caused by dust and impurities were discussed. The developments since the mid-1940s – which have led to modern integrated circuits, CMOS processes and indeed integrated sensors (e.g. modern photodiodes) – are well covered in the book “Crystal Fire” by Riordan and Hoddeson .
5.2.2. The p-n junction and Si/Ge photoeffects
While Nix in 1932  had shown that photoconductivity effects had been observed in many semiconductor and insulating materials (also see ), photoeffects in Silicon are often noted as being first observed by Russell Ohl at Bell Labs in February 1940. In Ohl’s 1941 patent , a block of Silicon cut from a small, solidified melt was shown to increase in conductivity when a strong light source was incident upon its surface (Figure 3 left). P-N junctions were present in this sample, prompting Ohl to suggest that: “Ingots which are suitable for the production of photo EMF cells, possess a characteristic structure which is visible when the surface is suitably prepared in vertical section”. Having noted striations in the cut Silicon sample, he named these striations “barriers”, suggesting that they are critical to operation of the photo-cell.
At the same time as Ohl’s work, many authors started to observe photoelectric effects, both photo-conductive and photo-voltaic, within p-n junctions and pure Silicon or Germanium . The earliest of these observations and diode structures are noted by Torrey and Whitmer ( p392) as being unpublished datasets from research on crystal rectifiers at the Massachusetts Institute of Technology (MIT) Radiation Laboratory  or military records (Miller and Greenblatt, 1945, US National Defence Research Committee) . However, 1946 saw assessment of carrier velocity using modulated light  (effectively the carrier transit time), new bridge photodiodes with similar sensitivity to Selenium diodes but with far better stability and temporal response  and observations of reverse saturation currents that varied in proportion to optical intensity . Likewise, in 1947 Bray and Lark-Horovitz  continued the paradigm that light quanta matching the material could lift electrons from full energy bands to the conductive band, while Benzer  showed photoelectric effects that were linear with intensity in a diode with both a p-n junction and a metal–semiconductor Schottky barrier. In fact, Benzer demonstrated that the overall diode I-V characteristic could be explained by the series combination of the separate junction I-V characteristics . While difficult to verify, the way Shive  utilised quotation marks when noting that Benzer observed a “photo-diode” effect  and the quotation marks used by Benzer himself may indicate the coining of the term as a replacement for “photo-cell” or, as used by , simply Silicon or Germanium rectifiers that happen to show a photo-effect.
In 1949, authors such as Fan and Becker were exploring the theoretical basis of both the photoconductive and photovoltaic phenomena. They noted that by considering the concentrations of conductive electrons and holes (i) liberated by thermal excitation and (ii) liberated by the internal photoelectric effect, together with (iii) the probability that carriers recombine, a suitable model could be derived . This model, shown in Eq. (23), uses one term for the rate of transition of electrons from valence to conduction band due to thermal generation and another for the rate of transition under optical excitation. It was shown to fit experimental values for the open-circuit voltage.
By August 1949, John Shive at Bell Telephone Laboratories proposed a variant of the photo-resistive cell, calling it a “photo-transistor”  (Figure 3 right). This portmanteau was likely through Benzer’s use of ‘photo-diode’ to describe optically sensitive p-n junctions  and the contraction of ‘transresistance’ into ‘transistor’ at Bell Labs at the time . Using perhaps the first dedicated photo-sensitive structure to explicitly use back-side illumination, Shive showed electrical gain of the optical signal at a reverse biased base–collector junction. This gain term was similar in magnitude to that observed between the emitter and collector in bipolar-junction transistors (BJTs), but with the emitter in this case being charges produced and injected photoelectrically .
6. Early transistor and microplasma history (1950–1959)
In 1950, two parallel strands of research were underway. The first was the growth of high-purity Silicon and Germanium ingots  using the lifetime of carriers as a guide to high “crystal lattice perfection”. This allowed the reduction of recombination centres , which were hindering both diode performance and theoretical studies. To achieve this, Bell Labs improved upon the existing Czochralski method of crystal growth , principally due to commercial expansion of solid-state rectification diodes . This was continued by McAfee and Pearson  for transistor optimisation. The second strand of the research centred on the continued optical and electrical investigation of p-n junctions formed in Silicon and Germanium. For example, Goucher  measured, using a pulsed light technique, the photon quantum yield of electron–hole pairs. A departure from unity quantum efficiency at short wavelengths led Goucher to conclude that there was a thin surface region of recombination centres . Elsewhere in the US, in recognition of future infrared sensors, absorption experiments were being carried out by Fan and Becker at Purdue University . Likewise, singular p-n junctions, rather than structures that included Schottky and p-n junctions in series , were being tested at Purdue. These experiments indicated that the junction capacitance, the input resistance of the measurement amplifier and the internal resistance of the unilluminated portions of the diode were key to maximising photovoltaic voltages .
In 1951, Pietenpol used p-n junctions for both rectification (showing a reverse I-V breakdown characteristic) and optical detection . Along with a unity quantum efficiency, he attributed agreement of experiment and theory as indicative of the “diffusion of current carriers to and from the junction”. For rectification, a high reverse breakdown voltage (1200 V) and bandwidth (200 kHz) demonstrated competitiveness with existing tube-based rectifiers and Bell Labs’ existing point-contact diodes, hence their commercialisation and continued research and development. In July 1951, Shockley, Sparks and Teal  presented work on n-p-n and p-n-p junction transistors, including phototransistors. This combined experimental results, furthering junction theories presented in [20, 34]. There are several interesting points from this and Shive’s earlier ‘photo-transistor’ . Firstly, trapping of carriers may fall off as carrier injection increases through a saturation effect, directly impacting the carrier lifetime. Secondly, a p-n hook region discussed by Shive could be utilised to obtain n-p-n photo-transistors that were extremely responsive to light, with quantum efficiencies of 100–200 electron–hole pairs per absorbed quantum . The bi-polar junction transistor became the basis for transistor-based analogue and digital circuitry prior to the routine use of CMOS and field-effect transistor (FET) technologies. Despite the success of bi-polar transistors, phototransistors have remained secondary to photodiodes and avalanche photodiodes due to longer response times and appropriate biasing requirements.
Of course, due to early crystal processing techniques, no junction used in experiments was entirely perfect. As such, research on the variety of breakdown effects began in earnest. McAfee and others at Bell Labs  started by investigating the Zener breakdown of the junctions, extending Zener’s earlier theory to include larger energy gaps. In perhaps the first mention of later avalanche studies, an alternative breakdown mode whereby secondary electrons are produced when the electric field reaches a “critical” value was discussed. This was discounted through experimentation, however “patch effects”, i.e. a prelude to defects yielding avalanche gain, were directly mentioned for future investigations . Complementing this, the onset of breakdown was shown to be sharp, with McAfee noting that, “a change of voltage of one-half percent is sufficient to cause the current to change by two orders of magnitude” . The measured slope of the I-V curves however did not fit Zener-Shockley theory, suggesting a further multiplication phenomenon. Further, the voltage noise was found to significantly exceed thermal noise . The prevalence of recombination as an explanation of optical and electrical behaviours led to Shockley and Read’s paper on the statistics of electron and hole recombination , Hall’s paper on the same topic in Germanium  (including carrier lifetime against temperature measurement to evaluate the “activation energy” of in-band trapping centres) and a method of experimentally probing the p-n junction  to obtain capacitance, junction width and voltage dependencies to inform theory. As breakdown, whether Zener or due to patch effects, restricted the effectiveness of solid-state diodes, Pearson and Sawyer  continued investigation using the Silicon crystals grown at Bell Labs.
Several important issues were elucidated, including that a built-in potential must be incorporated into breakdown theory, and that the I-V curve gradient in the Zener region was larger than theory predicted, which, while not yet understood, required investigation. The most important issues however were that ‘noise’ was observed at the Zener knee and that a “softness” of the reverse characteristic was also observed  (Figure 4C). The softness of the knee, defined as an unusual increase in current before true Zener breakdown, was improved by annealing. “Crystal lattice defects”, i.e. patch effects, were cited as a possible cause for this behaviour. The noise at the Zener knee (i.e. bifurcation of the I-V characteristic) showed clipped voltage pulses as high as 3 V. The noise was also temperature dependent, with pulses which were uniformly random. The noise behaviour varied greatly between units and was cited as being caused by mechanical issues within the junctions. This, along with the patch effects noted by McAfee , is the origin of later ‘Microplasma’ nomenclature for localised breakdown defects within such junctions.
The study of p-n junctions for optical detection continued with Shive  forming n-p-n photo-transistors whereby “the photoelectric absorption-activation process” generates holes that diffuse into the p-type region and, if trapped by the potential barriers, act to lower the barrier, prompting the increased passage of emitter to collector current. Effective quantum efficiencies of 1000 were achieved, with the efficiency being given by Eq. (24)  in terms of the n- and p-section conductivities, the diffusion distance of holes in the n-type region and the width of the p-type sandwich layer.
However, once multiplication (i.e. e-h pair production greater than the Zener emission of carriers) had been found, many authors began investigations into the effect using light, alpha-particles and thermal liberation for generation of initial carriers. The reason was that multiplication of photocurrents alone could remove the need for the photo-transistor’s continuous, optically modulated, emitter-collector current. The paper by McKay and McAfee in 1953  is key, as multiplication in slightly wider p-n junctions than previously studied  is attributed to an avalanche ionisation effect similar to Townsend avalanche [19, 20]. Indeed in 1967, Emmons  noted that this was the first time that Townsend’s avalanche theory was applied to the direct-current (DC) analysis of p-n junction multiplication behaviour. McKay and McAfee used avalanche multiplication to apply a gain to a photo-generated current, demonstrating increased quantum efficiency as the voltage approached breakdown, i.e. the avalanche photodiode, although the first such device is attributed to Nishizawa in 1952 (patent JP1955-8969A ). Linking back to Pearson and Sawyer , McKay and McAfee attributed the softness at the Zener knee to multiplication, within the junction, of the thermally generated dark-current, while pulsed experiments using alpha-particles showed that avalanche occurred on time scales less than s. Crucially, even in 1953 McKay and McAfee noted that “For wider junctions, the multiplication factor, M may become infinite for fields below those required for field emission” . This paper therefore not only indicates the origins of the avalanche photodiode, but also alludes to junctions for Geiger-mode avalanche devices such as SPADs. The pivotal work undertaken in McKay and McAfee’s 1953 paper prompted more theoretical treatment by McKay  and Wolff  in 1954.
Likewise, as p-n junctions as transistors were beginning to be used as fast switches, Kingston  theoretically investigated the switching time, showing dependence on the structure and the minority carrier lifetime. The minority carrier is the opposite to the dominant carrier within a doped region. For an n-type region, the dominant carrier is the electron, hence holes are classified as minority carriers.
In 1955, Miller  showed that avalanche breakdown also occurred in Germanium. Through investigation of the carrier ionisation rates, he presented agreement with Wolff’s theory. Interestingly, it was noted that while breakdown voltages should be static, the multiplication factor could be different depending on if an electron or hole initiated the avalanche. Crucially, Newman  discovered reproducible, defect-correlated light spots of approx. 10diameter within an avalanching junction. Soft breakdown was hypothesised to be due to breakdown of small patches at a spectrum of voltage levels below the Zener breakdown voltage, with the light being due to radiative relaxation of high-energy carriers produced during avalanche. Chynoweth and McKay confirmed this at Bell Labs finding further “localised light-emitting spots”, which were correlated to scratches, defects and the spatial location of the main p-n junction. They called these “Microplasmas” [62, 63], which was to become the de-facto nomenclature during the late 1950s and 1960s.
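The way in which the multiplication factor M grows without bound as the reverse bias approaches breakdown can be sketched with the empirical power-law expression commonly attributed to Miller's work, M = 1 / (1 - (V / V_B)^n). This form is not reproduced in this chapter; the exponent `n` is a fitted, material- and junction-dependent parameter, and the value used below is purely an illustrative assumption.

```python
import math


def miller_multiplication(v: float, v_breakdown: float, n: float = 3.0) -> float:
    """Empirical avalanche multiplication factor below breakdown.

    M = 1 / (1 - (v / v_breakdown) ** n)

    `n` is a fitting exponent (the default is an illustrative assumption).
    Returns math.inf at or above the breakdown voltage, where the
    linear-gain description no longer applies.
    """
    if v < 0 or v_breakdown <= 0:
        raise ValueError("v must be non-negative and v_breakdown positive")
    if v >= v_breakdown:
        return math.inf
    return 1.0 / (1.0 - (v / v_breakdown) ** n)


if __name__ == "__main__":
    # Illustrative sweep towards an assumed 30 V breakdown voltage.
    for v in (0.0, 15.0, 27.0, 29.7, 30.0):
        print(v, miller_multiplication(v, 30.0))
```

The steep rise of M just below breakdown is the linear-gain regime exploited by avalanche photodiodes, while the divergence at breakdown corresponds to the run-away regime relevant to Geiger-mode devices.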
Herein, researchers began investigation of Microplasma defects as noise sources within the wider p-n junction [63, 64]. The pulse behaviour was studied by Rose  finding long quiescent periods at which the device could be held at a voltage above the breakdown voltage, “but awaits the entry of a chance carrier into the region”. He also found that (a) microplasmas were much smaller than the observed in , (b) that current pulses could equate to a local heating effect, (c) that current pulse duration varied exponentially, (d) that random fluctuation to zero carriers could explain the “turn-off” behaviour, and (e) that an equivalent circuit was an unreliable tool due to variation in the microplasma dimensions. The initiation of avalanche was also of interest, with Chynoweth and McKay investigating the kinetic energy required for impact ionisation in Silicon . There was significant debate and variation in results for this kinetic energy in the literature. While 2.3 eV was speculated by Wolff , Chynoweth and McKay showed that the energy was 2.25 eV for electrons and the lower hole ionisation rates could be explained by an estimated threshold energy of 2.8 eV for holes . Values of 1.5 eV for electrons and 3.5 eV for holes were also proposed (Miller, 1957). In contrast, modern texts point to a minimum theoretical value given by Eq. (17), i.e. 1.65 eV, and measured values of 3.6 eV and 5.0 eV for electrons and holes respectively ( p79).
Between 1958 and 1959, studies split into two domains. The first was the study of single microplasmas, in comparison to uniform junctions that exhibited many such defects. Senitzky and Moll  achieved this using small area diodes (), with a sharp characteristic rather than the softness observed in some diode I-V characteristics and by initiating a defect by introducing an Aluminium impurity at a known position. The link between dislocations and breakdown was confirmed definitively by Chynoweth and Pearson , although at the time it was not possible to confirm whether avalanche at the location was due to increased electric field caused by dopant non-uniformity, carrier tunnelling due to traps in the energy bands, band-gap narrowing due to lattice distortion or indeed large crystal misalignments. The microplasmas, bi-stable turn-on turn-off bifurcation noise in the pre-breakdown region and noise pulses from multiple microplasmas found in [68, 71], were then verified by I-V and light emission studies in 1959  (Figure 4D).
The second domain was transient studies, both of microplasma turn-on, turn-off and the AC behaviour of diodes using avalanche. In 1958 Read  devised a high-frequency (5 GHz) microwave oscillator, utilising avalanche multiplication during the positive portion of a sinusoidal input signal. In doing so, he investigated the transit time, build-up and signal frequency response utilising Townsend’s 1901/1903 theory of impact ionisation for AC analysis. As noted in 1967 , Read hypothesised that avalanche build-up time constants will limit the AC bandwidth of such diodes. Avalanche transistors, and their transient behaviour ([75, 76], and references therein) contributed to noise performance investigations of the avalanche process, while Champlin  continued the microplasma bi-stable noise studies of Rose and McKay. He demonstrated that both current and voltage pulses could be modelled in an analytical manner. Under low series impedance conditions (5 Ω), many quantities of his model (probability rate for non-conducting to conducting transitions etc.) could be assumed to be time-independent allowing a Markov model to be used. In high series impedance cases (10 kΩ), time and voltage independence could not be assumed, producing a non-Markovian process, which Champlin noted as departing from previous models by Rose, although for many situations close agreement was found.
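The low-impedance case described above, in which the transition probabilities can be treated as time-independent, corresponds to a two-state (conducting or non-conducting) Markov process. The following is a minimal sketch of such a process and not a reproduction of Champlin's model: `rate_on` and `rate_off` are hypothetical constant transition rates, and all numerical values are illustrative assumptions.

```python
import random


def simulate_on_fraction(rate_on: float, rate_off: float,
                         total_time: float, seed: int = 0) -> float:
    """Simulate a two-state Markov 'microplasma' and return the fraction of time conducting.

    rate_on  : transition rate from non-conducting to conducting (per unit time)
    rate_off : transition rate from conducting to non-conducting (per unit time)
    With constant rates, the dwell time in each state is exponentially distributed.
    """
    rng = random.Random(seed)
    remaining = total_time
    conducting = False          # start in the quiescent state
    time_on = 0.0
    while remaining > 0.0:
        rate = rate_off if conducting else rate_on
        dwell = min(rng.expovariate(rate), remaining)
        if conducting:
            time_on += dwell
        remaining -= dwell
        conducting = not conducting
    return time_on / total_time


def steady_state_on_fraction(rate_on: float, rate_off: float) -> float:
    """Analytical long-run fraction of time spent conducting."""
    return rate_on / (rate_on + rate_off)


if __name__ == "__main__":
    print(simulate_on_fraction(1.0, 3.0, 20000.0))
    print(steady_state_on_fraction(1.0, 3.0))
```

In the high-impedance case the rates would depend on the instantaneous junction voltage and on elapsed time, so a constant-rate simulation of this kind would no longer apply.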
The 1950s were a period of significant progress in device design and fundamental research into breakdown behaviour. Inherent in this experimentation was the use of what would now be regarded as passive quench passive reset (PQPR) circuits , or rather, circuits that used a series resistance either as a current sense resistance (low-impedance), or in order to develop an appreciable voltage pulse suitable for counting (high-impedance). Initially, series resistances of 1 Ω were used [65, 67] however the general use of a load resistance to make measurements was quite standard [30, 38, 47]. Depending on the experiment, this resistance increased [59, 66] sometimes to values as high as 10 kΩ  while 50 Ω was used to match the 50 Ω input of test equipment . By the end of the 1950s, artificial, single-microplasma diodes had been used for study of avalanche multiplication and microplasmas as unwanted noise sources in p-n junction transistors. Models had also been proposed for microplasma bi-stability and the avalanche mechanism in semiconductors (e.g. the McKay, Wolff, Rose and Champlin models), and the link between lattice dislocations, doping imperfections and microplasmas had been definitively proven. Light and ionising-radiation applications had been explored, initially as methods of injecting carriers, but studies of photo-transistor and photo-carrier avalanche multiplication showed that such diodes could be important for the optical detection challenges at the time.
7. Artificial microplasmas and early applications: 1960 to 1969
Investigation of microplasma physics, and of the theory needed to explain microplasma behaviour, continued as intensely as in the previous decade. However, the 1960s can effectively be characterised as the starting period for both applications of the avalanche mechanism and the increased investigation of diode structures. As the number of researchers and open topics increased dramatically in this period, the discussion below will be split into two chronologically-ordered sections. The first will discuss the progression of microplasma modelling and experimental observations, while the second will discuss the evolution of the physical device and the applications to which it was applied.
7.1. Microplasma experimental observations and theories
Based upon earlier experiments [52, 57] and models , McIntyre proposed an extended microplasma model . This was tailored to linear and step junctions upon which most observations had been made, in comparison to the p-i-n junction . Deriving the turn-off probability, McIntyre conjectured that turn-off is due to the chance fluctuation to zero of carrier-pairs, thus preventing ionisation. Investigating turn-on mechanisms, he proceeded with a photomultiplier analogy yielding Binomial and Poisson theories, and an election candidate ballot box analogy. Despite some correlation, each departed from experiment.
While optical emission investigations continued, it was suggested in  that there were four classifications depending on whether microplasma pulses, light and multiplication were observed. However, at least one classification was in doubt: the combination of microplasma pulses without multiplication. This may have been due to measurement methodology issues, as carrier multiplication was suggested elsewhere [62, 63]. Two categories were suggested by Goetzberger and Stephens: those with (i) bright light emission but low breakdown voltage and (ii) dim light output and high breakdown voltage. Disagreement with the results in  was attributed to non-observable light emission due to the depth of some junction defects. Goetzberger and Stephens concluded that microplasmas were preferentially located at lattice dislocations, but that dislocations may not be the causal factor. They also concluded that microplasmas are “caused by some kind of imperfection that itself has a statistical distribution of its properties”.
Haitz and Goetzberger  proposed an improved method of investigating multiplication within microplasmas, refining experiments in . Indeed, multiplication can occur within microplasmas (up to an observed ratio of ), refuting the classifications in . To continue efforts in classification, two kinds of avalanche were noted, the first through microplasma action, and the second through entire-area avalanche breakdown observed in custom fabricated “guard-ring” diodes by Batdorf et al. . Proposing a theory for multiplication, Haitz and Goetzberger note that rather than continuous multiplication, photon arrivals cause a microplasma to turn on again, thus multiplication is by virtue of an increased time in which the microplasma is conducting. They thus relate microplasma photocurrent multiplication to an ionisation counter, re-affirming Ruge and Keil’s 1963 link between avalanche gain, current pulses and existing Geiger-Muller detectors  (see later).
Exploring avalanche breakdown with a microplasma-free junction , three interesting phenomena were discovered. Firstly, a theory in which statistical variation of donors and acceptors in the junction (Shockley, 1961) leads to non-uniformity in breakdown voltage and thus the avalanche breakdown of the whole area, was supported by experiment. Secondly, striations were observed through light emission which correlated to distinct annular non-uniformities in grown wafers. Thus, the diodes were not truly “uniform”, prompting further work on crystal growth and general p-n junction regularity. And thirdly, the pulse multiplication model in  was verified for high (>500) avalanche multiplication factors.
In 1964 and 1965, Haitz published two influential papers on the electrical behaviour and noise contributions of microplasmas and the avalanche mechanism [79, 80]. By proposing an equivalent circuit, Haitz modelled the phenomenological rather than actual nature of the microplasma (Figure 5A) . This uses an internal resistance, , in series with a bi-stable microplasma switch, , and a breakdown voltage extrapolated from the common multiplication vs. voltage curve, (Figure 5B). Haitz derived the current and voltage forms for on–off transients, giving the now standard view of the breakdown-quench-recharge cycle (Figure 5C). To quote , “As long as the microplasma is nonconducting, the diode capacity, , is charged to the applied voltage, . As soon as a carrier triggers an avalanche, the microplasma switches on to a current, , This turn-on is very fast and is estimated by Senitzky and Moll  to be of the order of sec. Due to the voltage drop across the capacity is discharged and voltage and current drop simultaneously to the operating point given by the intersection of the V-I characteristic and load line”. In , Haitz discusses four principal noise contributions, (i) thermal carrier generation, which is now known as dark count rate (DCR) noise, (ii) re-emission of carriers trapped during previous avalanches, i.e. after-pulsing, (iii) Zener/Shockley band-to-band tunnelling and (iv) minority carrier diffusion from elsewhere in the substrate, triggering an avalanche (see also Tager, 1964). Continuing his studies on noise, Haitz investigated an optical cross-talk mechanism [81, 82], although the re-absorption of light emitted by radiative recombination during avalanche had been discussed by Newman in 1955  and Champlin in 1959 . 
This supplemented the coupling experiments conducted by Ruge and Keil in 1963 , along with Conradi (1963), where the distances between and non-clustering of triggered microplasmas precluded thermal phenomena, thereby giving credence to Haitz’s 1962 hypothesis of optical coupling. Haitz  fabricated a wafer of over 100 artificial microplasmas (discussed later), using a diode with a background count rate of approx. 1 pulse/sec (at ). Through experiment, minority carrier and lattice phonon  mechanisms were discounted for distances of , as the coupling was still present between separate Silicon slices. Interestingly, an analysis by Ashkin and Gershenzon in 1963, of the refractive index of the space charge layer in a p-n junction in comparison to the bulk Silicon suggested a light waveguide which was denoted as a “pipe” . For closer distances, minority carriers were proposed as a mechanism whereby the avalanche spread laterally .
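The breakdown-quench-recharge cycle that Haitz derived from his equivalent circuit can be illustrated with a minimal numerical sketch. It assumes an ideal bi-stable switch, a fixed internal resistance in series with a voltage source at the extrapolated breakdown voltage, a diode capacitance, and a load resistance to the applied bias. The names `v_a`, `v_b`, `r_s`, `r_l` and `c`, and all component values, are hypothetical choices for illustration and are not taken from Haitz's papers.

```python
import math


def on_transient(t, v_a, v_b, r_s, r_l, c):
    """Diode voltage and microplasma current t seconds after turn-on.

    Assumed circuit: capacitance c initially charged to the applied voltage
    v_a, load resistance r_l to the supply, and (switch closed) an internal
    resistance r_s in series with a source at the breakdown voltage v_b.
    """
    tau_on = c * r_s * r_l / (r_s + r_l)             # discharge time constant
    v_final = (v_a * r_s + v_b * r_l) / (r_s + r_l)  # load-line operating point
    v = v_final + (v_a - v_final) * math.exp(-t / tau_on)
    i = (v - v_b) / r_s                              # current through the microplasma
    return v, i


def recharge(t, v_a, v_start, r_l, c):
    """Diode voltage t seconds after turn-off, recharging through r_l."""
    return v_a - (v_a - v_start) * math.exp(-t / (r_l * c))


if __name__ == "__main__":
    # Hypothetical values: 2 V excess bias, 1 kOhm internal, 100 kOhm load, 1 pF.
    params = dict(v_a=30.0, v_b=28.0, r_s=1e3, r_l=100e3, c=1e-12)
    for t in (0.0, 1e-9, 5e-9):
        v, i = on_transient(t, **params)
        print(f"on  t={t:.1e} s  V={v:.3f} V  I={i * 1e3:.3f} mA")
    v_off, _ = on_transient(1e-6, **params)
    for t in (0.0, 1e-7, 5e-7):
        v = recharge(t, params["v_a"], v_off, params["r_l"], params["c"])
        print(f"off t={t:.1e} s  V={v:.3f} V")
```

In this sketch the current starts at the excess bias divided by the internal resistance and settles to the intersection of the diode characteristic with the load line. The turn-off instant itself is statistical and is not modelled here.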
While the pulsed mechanism was of interest for light and gamma detection, the continuous-time gain used in APDs still required examination. In 1964 Lee, Logan et al. reinvestigated the kinetic energy ionisation rates for electrons and holes, as several inconsistencies had been shown between previous work and a more recent analysis by Baraff in 1962. In particular, a simplified analysis was combined with a refinement in the cleanliness of test diode growth and the use of a local multiplication uniformity rather than uniformity of emitted light approach. They noted that better agreement to an ionisation energy in the range , could be due to microplasma-free junctions and a method that allows purer injected hole currents. Analysis of avalanche as applied to APDs continued in 1966 with several critical papers. McIntyre concentrated on inferring the SNR that could be obtained for applications requiring high photodiode gain. The noise of the process was shown to increase with the cube of the multiplication factor, M; however, McIntyre noted that if most of the carriers entering the high-field region have a high ionisation rate, the noise factor decreases. Melchior and Lynch commented that multiplication in the diode is limited by its noise in comparison to receiver noise. Baertsch showed discrepancies with McIntyre’s noise theory, particularly (a) a reduction of noise if the primary photocurrent in a Silicon APD was due to holes, and (b) an increase in noise greater than McIntyre’s theory at high multiplication values. This departure was blamed on electronic noise or the differing ratio of electron and hole ionisation rates. One of the complications in all experimental studies within the 1960s was the handling of diode structure, doping or bulk-material induced changes in characteristics, and the associated departures from theory.
Indeed, with a different diode structure Baertsch  presented significantly better agreement to McIntyre’s theory, highlighting the difficulties in obtaining reproducible results. Sze and Gibbons  re-calculated the ideal (microplasma-free) breakdown voltages for both abrupt and linearly graded junctions for several bulk materials including Silicon, Germanium and alternative III-V alloy diodes. This allowed other researchers to estimate the departure of their devices from the ideal, entirely uniform breakdown characteristics. This was used as a baseline for investigation into junction curvature in fabricated planar diodes , continuing the work by Gibbons and Kocsis in 1965. The results showed that for abrupt junctions, the radius of curvature, significantly impacted the breakdown voltage through the modification of the electric field intensity. If the curvature was equal to the junction depth, this produced a more marked dependence. The breakdown in this region was always lower than that in the planar region, producing edge breakdown effects and structural dependence on results. In comparison, linearly graded junctions had only a small dependence on curvature.
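The cubic noise dependence attributed to McIntyre above can be checked with a short sketch. It uses the closed form in which McIntyre's excess noise factor is commonly quoted in later literature, F(M) = kM + (1 - k)(2 - 1/M), for injection of the more strongly ionising carrier. Here k, the ratio of the weaker to the stronger ionisation rate, and the function names are assumptions for illustration and do not appear in the text above.

```python
ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs


def excess_noise_factor(m, k):
    """Commonly quoted form of McIntyre's excess noise factor.

    m: mean multiplication factor (m >= 1)
    k: assumed ionisation-rate ratio, 0 <= k <= 1
    """
    return k * m + (1.0 - k) * (2.0 - 1.0 / m)


def noise_spectral_density(i_primary, m, k):
    """Multiplied shot-noise current spectral density, in A^2/Hz."""
    return 2.0 * ELEMENTARY_CHARGE * i_primary * m**2 * excess_noise_factor(m, k)


if __name__ == "__main__":
    for k in (1.0, 0.1, 0.0):
        ratio = noise_spectral_density(1e-9, 20, k) / noise_spectral_density(1e-9, 10, k)
        print(f"k={k}: doubling M from 10 to 20 scales the noise by {ratio:.2f}")
```

With equal ionisation rates (k = 1) the factor reduces to F = M, so the spectral density grows as the cube of M. As k falls towards zero, F saturates near 2 and the growth approaches the square of M, which matches the observation that the noise factor decreases when the injected carriers are the more strongly ionising type.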
Towards the end of the 1960s, theoretical work returned to the question of the avalanche mechanism, noise, transient and frequency behaviour and the avalanche turn-on mechanism. In  the existing models of avalanche breakdown were extended. The variation in multiplication was stated to be due to differences in the electron and hole ionisation rates, and not spatial non-uniformity of the electric field within the junction. In 1967, Haitz provided extension and experimental verification of an avalanche noise theory by Hines. Here, (i) the assumption that the avalanche region is small in comparison to the total drift region, (ii) an assumed power law for the ionisation rate voltage dependence and most importantly, (iii) the neglect of the spreading and thermal resistance of the bulk material, were each accounted for via correction factors. Haitz observed peaks in the noise spectrum with avalanche current, attributing these to variation in doping as per Shockley (1961) and not microplasma action. Consequently, he updated the noise model to include noise generated by a Poisson spatial distribution of dopant impurities within the diode, and reiterated the lateral spreading mechanism.
By the middle of 1967, Emmons noted that, depending on the ratio of electron and hole ionisation rates, there need not be a reduction in the bandwidth of a diode due to avalanche multiplication (as proposed by Read in 1958 and Lee and Batdorf in 1964). These earlier works assumed that (i) the electron and hole ionisation rates were equal, (ii) the velocities were also equal and (iii) Maxwell’s time-varying electric field, polarisation “displacement current” could be neglected. Through analysis, Emmons showed (using equations for the DC and AC cases of Townsend gas discharge) that if the avalanche multiplication was kept below the ratio of the ionisation rates, i.e. that , then bandwidth was not dependent on avalanche multiplication and indeed that noise was minimised. His application at the time was to find the conditions that would “produce the closest solid-state approximation to the vacuum-tube photomultiplier”.
In 1968, Nishizawa, already credited with the invention of the APD in 1952, was working with Kimura on microplasma turn-on mechanisms. Treating turn-on as a stochastic phenomenon, they modelled it using a 2-state Markov process, concluding that the turn-on probability was a strong function of field intensity in the p-n junction and the rate at which carriers are generated near the junction. This turn-on probability was then utilised by Melchior and Goetzberger in 1969 to form a “gating” or “quenching” technique using a sinusoidal excess bias waveform. Thus, by the end of the 1960s, many facets of the theory of microplasmas in Silicon and Germanium, and the avalanche multiplication process had been confirmed and informed by experimental observation.
7.2. Junction and artificial microplasma structures and applications
To support both theoretical and experimental work (previous sub-section), and to investigate applications for such high-sensitivity photodiodes, significant effort was made in the 1960s to remove microplasmas from p-n junctions. This required innovation in planar technologies (dopant implantation and diffusion, masking and etching etc.), and investigation of diode structures (topologies, guard-rings and substrates etc.). Eventually, applications emerged using true microplasma-free uniform avalanche multiplication, and the technologies surrounding APDs and GM-APDs became more mature.
In July of 1960, Batdorf and Chynoweth et al. proposed the use of a planar “guard ring” that surrounds the active area (also see ). The reasoning was that if there were no dislocations within the junction, and if doping was uniform, then the periphery of the junction would be the next preferred breakdown site. Their diode, shown in Figure 5D, used a lightly doped p-type region () around a known diameter circular n-type area diffused into a p-substrate. While the nomenclature did not spread, they called this test structure of deterministic diameter a “Macroplasma”. This was a step towards planar technology although it incorporated some surface etching similar to previous mesa (i.e. table) structures (diameter). Goetzberger and Stephens note the use of open tube systems for diffusion, a multiple predeposit-wash technique and guard rings for uniform doping and minimal surface defects. It is difficult to ascribe guard rings to particular authors, as Senitzky and Moll used an effective virtual guard ring formed by removal of surrounding Silicon (forming their mesa structure), although this induced surface issues. Modern processing often forces device designers down either a planar or mesa structural path.
At this point, many structures were fabricated to study uniform breakdown; however, in 1963, Ruge and Keil set about using microplasmas and avalanche gain for gamma radiation detection. They compared the voltage-pulse output and the linearity with incident radiation as equivalent to Geiger-Muller radiation detectors (re-iterated in ). Despite their application, at this point (a) gamma detection in Silicon had been proposed albeit with low signal amplitudes that necessitated high-gain, low-noise amplifiers, and (b) alpha-particle radiation had already been used for the study of ionisation rates in the avalanche process. Also in 1963, Goetzberger et al. noted several technical processing advances that were required for minimal or zero microplasma densities in fabricated diodes. Finding that their process led to microplasmas originating at surface effects, they utilised electrolytic polishing, the reverse process of common electroplating. The material deposition technique was also refined using Helium as a carrier gas, as more reactive gases led to the formation of undesirable precipitates and a phenomenon called surface pitting. The multiple pre-deposit and washing technique was again suggested, as depositing phosphorus led to a glass being formed (SiO2 and phosphorus pentoxide), which hindered diffusion into the substrate and promoted non-uniform doping.
Between 1964 and 1965, significant device and application progress was made on avalanche photodiodes. In , Johnson noted significant SNR and signal amplitude enhancement in uniform breakdown APDs. This was then extended by Johnson in 1965, noting that at high multiplication factors (high-M) and light levels, the SNR became dominated by shot noise, while thermal noise dominated the low-light, low-M case. Johnson therefore suggested that for modulated optical signals, the modulation depth must be large to maximise receiver SNR. While Johnson was at Texas Instruments Inc., researchers including Anderson and McMullin (under the supervision of Goetzberger) continued work at Bell Labs on microplasma-free APDs at microwave frequencies. Testing custom n+-n-p diodes (which incorporated n-type guard rings, see ) at frequencies up to 10 GHz, they noted that (a) electrons have an advantage in terms of ionisation coefficient, although the bandwidth becomes limited by the electron diffusion time, and (b) the SNR becomes limited by photon shot noise. Their tested diode structure is shown in Figure 5E with an n+ to p-type junction. Melchior and Anderson then noted that an optimum SNR could be obtained if the “multiplication is such that the multiplied shot noise is just equal to the sum of the series resistance and receiver noise”. They warned that M-factors greater than this may improve optical sensitivity, but the SNR remains dependent on receiver noise.
As previously mentioned, Haitz fabricated an array of over 100 diodes, using an n+-p active junction and a wide radius n− to p-substrate guard ring. Firstly, this came at a time when arrays of diodes were being investigated for solid-state imaging by Schuster and Strull (October 1965). Their work is often cited as the first photodiode array; however, the Haitz array (which did not include readout circuitry suitable for imaging) was published in April of that year. Secondly, Haitz used avalanche and microplasma pulses as a direct equivalent of modern SPAD arrays and Si-PMs, whereas Schuster and Strull utilised photo-transistors equivalent to modern CMOS image sensors with in-pixel electrical amplification. Thirdly, the depth of the guard ring diffusion used by Haitz allowed a deep region under the junction where any minority carriers would be quickly absorbed in the n− regions, thus reducing the diode background pulse rate.
In January of 1966, Lee and Batdorf presented an overview of research on avalanche multiplication, the time dependence of AC-signals and the recent technological developments for applications such as high-speed APDs and the Read microwave oscillator. They also noted that efforts to remove the guard ring, but to still constrain avalanche using oxidisation and surface treatment techniques, had proved fruitless. To this day, guard rings are still a requirement but can present a fill factor issue, thereby limiting optical sensitivity. In February and June respectively, Huth at General Electric and Haitz and Smits at Texas Instruments in collaboration with Bell Labs published results, noise analysis and application discussions for Germanium APDs for x-ray, alpha-particle and IR-optical detection. In Europe, Keil and Bernt were investigating microplasmas in Silicon for infrared detection, building upon previous infrared absorption experiments. In December of 1966, Melchior and Lynch again addressed signal and noise responses in APDs at modulated speeds up to 10 GHz, building upon . As light detection using the increased pulse rate of a microplasma includes turn-on turn-off noise, Melchior and Anderson suggested large area, uniform avalanche detectors as of primary interest for high (but not single-photon) sensitivity. As such, they proposed an amalgamation of the mesa structure and the planar guard ring in an attempt to reduce the reverse current and device noise, while promoting uniform multiplication across a diameter optically active area. They thus reduced the curvature of the junction, minimising edge breakdown, while protecting the junction from edge/surface effects.
A highly pertinent question for applications research at the time was how to optimise these avalanche photodiodes. Ruegg at Stanford Electronics Laboratories proposed design parameters, a diode structure and manufacturing processes for an optimised device. This was an initial “reach-through” APD. While presenting the structure, he stated that an optimised diode can be formed by using a depletion layer that reaches through the device to the illuminated surface, and with a depletion layer width that is approximately equal to the penetration depth. Such devices are still used and have been further refined [3, 4, 5]. While investigation and improvement of manufacturing processes was continuing to remove microplasmas, it was clear that to improve both performance and yield, circuit and electrical innovation was also needed. In 1967, a bias voltage mechanism called “AC-pumping” was proposed. This was in fact the first use of a “gating” signal, whereby microplasmas and avalanche are suppressed electrically. When the gating signal is high, the bias is above breakdown, allowing avalanche gain upon the reception of a minority carrier. When the gating signal is low, the diode does not exhibit microplasma pulses. When gating at high speed (), the diodes can be significantly improved, including diodes which previously showed correct microplasma-free uniform avalanche behaviour. This effect was studied in greater depth in 1969, where the nomenclature of a “quenching technique” was possibly coined for the first time with respect to man-made avalanching junctions. The square- and sinusoidal-wave gating techniques are still utilised, often in quantum key distribution systems where noise can be suppressed and performance can be increased.
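The gating principle described above can be sketched as a simple timing filter. This is an idealised illustration only: it assumes a square gate, treats every carrier arriving while the gate is high as producing a pulse, and ignores the bias-dependent turn-on probability. The function names and the nanosecond values are hypothetical.

```python
def gate_is_high(t_ns, period_ns, high_ns):
    """True if time t_ns falls in the high (above-breakdown) part of the gate."""
    return (t_ns % period_ns) < high_ns


def gated_counts(carrier_times_ns, period_ns, high_ns):
    """Count carriers arriving while the diode is biased above breakdown."""
    return sum(1 for t in carrier_times_ns if gate_is_high(t, period_ns, high_ns))


if __name__ == "__main__":
    # Hypothetical arrivals (ns) against a 1000 ns gate that is high for 500 ns.
    arrivals = [100, 600, 1200, 1900]
    print(gated_counts(arrivals, period_ns=1000, high_ns=500))
```

Under these assumptions, randomly timed noise carriers are rejected in proportion to the fraction of the period for which the gate is low, whereas a signal synchronised to the gate is retained. This is one way to see why gating suppresses noise in synchronised systems.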
Following from Sze and Gibbons’ work on junction curvature, Kao and Wolley, and separately Sigmund, proposed structures optimising the junction for spherical or circular edges. In , Kao and Wolley investigated guard rings with both (i) different spacing between the active junction edge and guard ring, and (ii) multiple concentric guard rings. Finding that this reduced surface breakdown, they also found that it improved junction curvature. In , Sigmund used an alloying technique to back-fill mechanical depressions in an n-type Silicon wafer. The annealed, alloyed structure formed was a back-side illuminated cone on p-type Silicon within the n-type wafer. The tip of this cone, orientated towards the illuminated side, formed a spherical junction, with a radius of curvature of , with a depth from the illuminated side of and an active region diameter of approx. . While this was a novel diode structure, Sigmund also proposed a capacitive readout, passing only higher frequencies to a buffer that separated the diode circuit from the impedance of testing equipment. While the later decades are not covered here due to the explosion in publications per annum in this field, the work of the 1960s drove physicists and engineers in the 1970s and 80s to explore device structures for high-performance optical detection applications. Many of the authors above were supported through companies such as Bell Labs, Texas Instruments (TI), General Electric (GE), International Business Machines (IBM) and Shockley Transistor. This highlights the commercial drive for solid-state, high-gain, high-bandwidth photodiodes.
This chapter has introduced photon detection and some background into the nature of the ‘wave’ or ‘particle’ we wish to detect. Discussing three viewpoints for the question of: “What is a photon?” we have linked this with phenomena that allow the detection of photons: the internal/external photoelectric effect, Compton scattering and pair-production.
This chapter has also discussed the historical development of avalanche multiplication, with a focus on solid-state semiconductor detectors. Two classes of photon-counting detectors have resulted: (i) linear gain devices such as APDs and EMCCDs, and (ii) Geiger-mode devices such as SPADs and Si-PMs. The impact ionisation mechanism allows an initial electron–hole pair, generated via internal photoelectric effects, to be multiplied into many current carriers. We have seen that acceleration by an electric field imparts kinetic energy to an initial carrier, so that, upon a collision with the semiconductor lattice, the energy is sufficient to promote an electron from the valence to the conduction band. John Townsend hypothesised (1901) that breakdown in a vacuum tube containing trace gases could be explained by the exponential ionisation of the molecules. Developments in solid-state rectifiers later led to the photodiode (1940), the p-n junction and the transistor (1947). Naturally, there were a variety of non-ideal behaviours to investigate, with junction breakdown being found to sometimes be localised at crystal defects or regions with different doping levels. These “microplasma” defects were hypothesised to involve impact ionisation, prompting both (i) methods to remove them and (ii) direct utilisation of the mechanism to provide electrical gain (1960s).
History illuminates the path of science and engineering allowing us to see what has been attempted by previous researchers. By citing primary sources, this chapter aims to provide a literature review for the period of 1900 to 1969, i.e. the early history of modern photon counting technologies such as single-photon avalanche diodes and silicon photo-multipliers. As the field has grown exponentially, and has widened with respect to technologies, modern developments (1970 to present) and other gain mechanisms require similar historical studies. However, we have included several recent review articles and texts that point to modern trends and performances [1, 2, 3, 4, 5, 17, 19]. To finalise this chapter and allow the reader to follow the literature into the 1970s to 2000s period, (i) a reference from 2003 showing the beginnings of integration of SPADs with planar integrated circuits in CMOS technologies , and (ii) a 2007 two-part technical review of operation principles, features and electronics for SPAD arrays [119, 120], are provided.
- The acronyms (i) CERN, (ii) ATLAS and (iii) CMS refer to (i) the Conseil Européen pour la Recherche Nucléaire (Geneva, Switzerland), and the high-energy particle physics experiments (ii) A Toroidal LHC ApparatuS and (iii) the Compact Muon Solenoid, respectively.