Open access peer-reviewed chapter

Computational Methods for Photon-Counting and Photon- Processing Detectors

Written By

Luca Caucci, Yijun Ding and Harrison H. Barrett

Submitted: 23 May 2017 Reviewed: 02 November 2017 Published: 20 December 2017

DOI: 10.5772/intechopen.72151

From the Edited Volume

Photon Counting - Fundamentals and Applications

Edited by Nikolay Britun and Anton Nikiforov

Chapter metrics overview

1,598 Chapter Downloads

View Full Metrics


We present computational methods for attribute estimation of photon-counting and photon-processing detectors. We define a photon-processing detector as any imaging device that uses maximum-likelihood methods to estimate photon attributes, such as position, direction of propagation and energy. Estimated attributes are then stored at full precision in the memory of a computer. Accurate estimation of a large number of attributes for each collected photon does require considerable computational power. We show how mass-produced graphics processing units (GPUs) are viable parallel computing solutions capable of meeting the required computing needs of photon-counting and photon-processing detectors, while keeping overall costs affordable.


  • photon-processing detectors
  • maximum-likelihood estimation
  • GPU
  • parallel processing
  • gamma-ray photons
  • charged particles

1. Introduction

In broad terms, detectors used in imaging can be classified into a small number of categories depending on their working principles. These categories include integrating detectors, pixelated photon-counting detectors as well as a new class of detectors, which we refer to as photon-processing detectors.

An integrating detector measures charges accumulated at each pixel location. These charges are induced by light impinging on the detector and are proportional to the average number of photons incident on each pixel. Dedicated circuitry reads out these changes and converts them to numbers roughly proportional to the charge accumulated at each pixel.

A photon-counting detector works by counting the number of photoelectric interactions observed during the exposure time. Count registers associated with each pixel are read at the end of the exposure time, thus making the output of photon-counting detectors a collection of pixel counts.

A photon-processing detector may use any existing detector technology to measure several “attributes” for each photon entering the detector. Attributes include the photon position, its direction of propagation and the amount of energy it deposited in the detector. This is accomplished by reengineering the detector design so that additional information can be extracted from raw unprocessed data. Important aspects of any photon-processing detector include the algorithm used to estimate photon attributes from raw data as well as how these attributes are represented and stored in the memory of a computer.

Particle-processing detectors are a variation on photon-processing detectors and are designed to detect charged particles, such as alpha and beta particles. Particle-processing detectors enable a new imaging technique—called charged-particle emission tomography (CPET)—which attains high-resolution 3D imaging in living organisms so long as accurate estimation of parameters for each charged particle is available.

This chapter is organized as follows. Section 2 provides an overview of detectors suitable for photon counting and photon processing. Maximum-likelihood estimation (MLE) and its properties are discussed in some detail in Section 3. The next section—Section 4—introduces graphics processing units (GPUs) and the compute unified device architecture (CUDA) programming model. Section 5 presents algorithms for photon-counting detectors, while Section 6 discusses photon- and particle-processing detectors and presents fast GPU-based algorithms for maximum-likelihood estimation of photon parameters. Finally, Section 7 summarizes this chapter and discusses possible applications of photon-processing detectors.

A portion of this chapter has been adapted from Y. Ding, “Charged-Particle Emission Tomography” [1].


2. Detectors for photon counting and photon processing

2.1. Gamma-ray cameras

Gamma-ray cameras are used in nuclear medicine to image gamma-ray photons emitted by radioactive elements. The first gamma-ray camera was developed by Hal Oscar Anger in 1957 [2]. Anger’s original design, often referred to as an “Anger camera,” is still widely used today. A diagram of an Anger camera is provided in Figure 1.

Figure 1.

Diagram of a typical gamma-ray camera (adapted from [3]).

An Anger camera includes a scintillation crystal, a light guide and an array of photomultiplier tubes (PMTs). When a gamma-ray photon interacts with the scintillation crystal, a burst of visible-light photons is produced. Some of these photons propagate through the crystal and the light guide and enter one or more PMTs. When a photon enters a PMT and interacts with it, a measurable electrical signal in the form of a narrow current pulse is produced. This pulse is transmitted to amplifying electronics, so that it can be analyzed. A transimpedance amplifier converts the current pulse to voltage. A shaping amplifier further amplifies the signal and reshapes it, making it broader and smoother. A broad signal is easier to sample via an analog-to-digital converter. The output of the analog-to-digital converter can be scaled to obtain an integer number representing the number of photons entering the PMT. Digitized samples collected from each of the K PMTs are then scanned for events. Detected events are stored in the memory of a computer in the form of scaled PMT samples g1,,gK.

Detailed analysis of the physical processes that take place inside the scintillation crystal and each PMT allows us to derive a statistical model for the scaled PMT samples g1,,gK produced by a gamma-ray camera with K PMTs. Because of noise, PMT samples g1,,gK can be thought of random variables. If we normalize each PMT signal by the gain of the PMTs and we ignore the noise in the gain, random variables g1,,gK can be shown to be conditionally independent and to follow Poisson statistics with means, respectively, g¯1RE,,g¯KRE [4]. Thus, we can write:


Functions g¯1RE,,g¯KRE are called mean detector response functions (MDRFs), and they describe the mean detector response upon detection of a gamma-ray photon with energy E at location R.

2.2. Semiconductor detectors for charged particles

Semiconductor pixelated detectors can be used to measure position and energy of charged particles, including alpha and beta particles. One possible detector configuration consists of a layer of semiconductor material (which we refer to as the “active volume”), a set of anodes placed on one side of the detector’s active volume, and some data-processing circuitry (such as application-specific integrated circuits or ASICs) that measures the anode signals and converts them into digital signals.

When a charged particle enters the detector’s active volume and deposits some of its energy, electron-hole pairs are produced along the particle’s track. The electrons and holes drift in opposite directions under an electric field applied throughout the detector’s active volume. This process is accompanied by the production of electrical charges, which are collected by electrodes on one side of the detector’s active volume. These charges are then converted to digital signals (e.g., number of electron-hole pairs produced) and are either sent to a computer or accumulated in count registers to form an image.

An example of a semiconductor pixelated detector for alpha and beta particle is the Medipix2 sensor (Figure 2) developed at CERN [5]. The Medipix2 sensor features an array of 256 × 256 square pixels of size 55 μm. The counter in each pixel of a Medipix2 sensor can record the duration of an event that is above a threshold, from which the energy collected at each pixel and the particle’s residual energy can be measured.

Figure 2.

Diagram of the Medipix2 chip sensor (

A statistical model for the data produced by a semiconductor detector for charged particles (such as the Medipix2 sensor) must take into account the so-called charge sharing effect [6] as well as many variables, including particle’s position R and energy E, its angle of incidence (denoted as the unit vector s) and bias voltage Vbias applied across the semiconductor. Some recent results for the Medipix2 sensor have been reported in [1, 7]. When a highly energetic particle enters the detector, a large number of charges will be collected at its electrodes. In such a case, the statistics of pixel outputs g1,,gM (where M denotes the number of detector pixels) conditioned onR, E, s and Vbias approach Gaussian statistics, and we can write:


in which g¯mREsVbias is the mean of the mth pixel and σmREsVbias is the standard deviation of gm.

2.3. Intensified charge-coupled detectors

A charge-coupled detector (CCD) is a semiconductor device that produces a pixelated image by converting incoming photons into electric charges, which are then stored at each pixel location. These charges are induced by photons with energy exceeding the semiconductor bandgap. The most general form for the mean output g¯m (calculated by imaging the same object over and over again) is [8]:


in which m varies from 1 to the total number of pixels M, ηmREs is the quantum efficiency at pixel m, point R on detector face, photon energy E and along direction s. The function LREst is the spectral photon radiance at point R for photon energyE, time t and along direction s [8, 9]. In Eq. (3), the spatial extent of the detector was denoted as “det” and “hemidΩ” means integration over all the possible directions s incident on the detector. Finally, integration over the time variable t starts at time t=0 and ends at time t=T.

An intensified charge-coupled detector (ICCD) uses an image intensifier (such as a microchannel plate (MCP)) to amplify scintillation light before imaging it onto a CCD sensor. The image intensifier provides optical gain (in the range from 105 to 106 or more) so that low-energy optical photons (emitted, e.g., upon interaction of a charged particle with a scintillator) can be imaged with practically any CCD sensor. Lenses, usually placed between the image intensifier and the CCD sensor, reimage the image intensifier’s output window on the CCD sensor. Examples of intensified charge-coupled detectors include the iQID sensor developed at the University of Arizona by Brian W. Miller [10].

A proper statistical model for an intensified charge-coupled detector must consider both the statistics of the output produced by the image intensifier and the statistics of the data produced by the CCD sensor. To find a model for the image intensifier, we begin by noticing that each point in the CCD sensor can be propagated back through the lenses all the way to the entrance face of the image intensifier. Therefore, we can consider the number of photons pm impinging on the image intensifier at locations that fall within pixel m on the CCD sensor. Under broad conditions, we can show that pm obeys Poisson statistics and we denote the mean of pm as p¯m. For large enough p¯m, the statistics of pm are approximatively Gaussian.

A general expression that relates p¯m to the sensor output g¯m takes the form:


in which A¯ denotes the mean of the image intensifier amplification (gain) A . The variance, σm2, of g¯m is related to p¯m and the statistics of A as follows [7]:


in which σA2 is the variance of the amplification A andσread2 denotes the variance of the sensor’s readout noise. If the blur introduced by the image intensifier and optics located between the image intensifier and the CCD sensor is smaller than the size of a sensor pixel, then output gm is independent on gm for any mm. If we further assume that the amplification A and the readout noise also obey Gaussian statistics, we can write [1, 7]:


3. Maximum-likelihood estimation

Maximum-likelihood estimation (MLE) is a statistical method that uses observed noisy data to estimate model parameters. For a good historical treatment of the concept of maximum-likelihood estimation, the interested reader can consult [11]. Given a set of observed data and an underlying model (which depends on some unknown parameters), MLE calculates the values of the parameters that better explain the observed data. The observed data that are used for maximum-likelihood estimation are realizations of random variables. Thus, parameters we estimate from these data are realizations of random variables as well.

Maximum-likelihood estimation can, in principle, be used with all the detectors discussed above. For example, we show how maximum-likelihood estimation is used to estimate position of interaction from PMT data, and we discuss an efficient parallel algorithm for it. Moreover and as we argue in Section 6, maximum-likelihood estimation is the estimation method of choice for photon-processing detectors.

3.1. Mathematical description

Let us denote the parameters we want to estimate as the vectorθ. The model itself is characterized by a probability density function (PDF), denoted as prxθ). We use the vector x to refer to the complete data, while we denote the incomplete data as y [3, 8]. We stress that we do not directly observe x, but only indirectly and through the vector y. Vectors x and y are statistically related via the PDF pryx). Probability density functionsprxθ) and pryx) allow us to write


in which pryθ) is the PDF of the observed data y given the parameter θ. Eq. (7) above takes into account two separate “mechanisms” that, when concatenated, produce a sample y from the value ofθ. The first mechanism produces the complete data x according to prxθ), while the second mechanism samples pryx) to produce the incomplete data y.

MLE solves the estimation problem by finding the vector θ that maximized the likelihood Lθy for observed data y. Mathematically, this concept is formalized as:


in which the “hat” symbol denotes an estimated quantity, and we have defined the likelihood as:


Likelihood Lθy has to be interpreted as a function of θ for fixed (measured) y. In Eq. (8), we used “argmaxθLθy” to denote the value of θ that maximizes the likelihood. Becausey is the result of a noisy measurement, the actual value of y in Eq. (8) will change if the measurement is repeated. In other words, y is a random quantity, and this implies that the ML estimate θ̂ML is random as well.

Alternatively,θ̂ML can be calculated by rewriting Eq. (8) as


in which we have introduced the log-likelihood [8]


Because the logarithm is a strictly monotonic function, the expression in Eq. (10) is equivalent to the one in Eq. (8). Often, the log-likelihood θy is numerically easier to calculate with a computer than the likelihood Lθy.

3.2. Properties of ML estimates

Maximum-likelihood estimates have many desirable properties. Some of these properties are summarized below.

  • Asymptotic efficiency. Ify represents a set of repeated independent and identically distributed samplesy1,yM, asymptotic efficiency of MLE implies that, as M increases, the variance of each component of θ̂ML converges to the smallest possible value, which is given by the Cramér-Rao lower bound [12, 13].

  • Functional invariance. Assume the ML estimate of a parameter vector θ is θ̂ML and consider a function uθ of θ. We can identify uθ with the parameter vector μ, and we can consider a maximum-likelihood estimate μ̂ML of μ. Then [14]


    This equation shows that the property of being a maximum-likelihood estimate is preserved if we consider a function of the maximum-likelihood estimate itself.

  • Sufficiency. Any quantity Ty1yM calculated from samples y1,,yM and used to estimate an unknown parameter vector θ is said to be a sufficient statistic for y1,,yM if no other quantity that can be calculated from the same samples would provide additional information regarding the value of the parameter vector θ. In simple terms, a sufficient statistic is a function of the samples y1,,yM that “compresses” them without losing any information aboutθ. Sufficiency for a maximum-likelihood estimate θ̂ML can be stated by saying that θ̂ML is a function of a sufficient statistic for θ [15].

  • Consistency. Consistency of an estimator regards the behavior of the estimator as the sample size M increases. Consider the case in which y is a set of repeated independent and identically distributed samplesy1,,yM . It is possible to show that, when the range of the elements ofy=y1yM does not dependent on the parameter vector θ, there exists a maximum-likelihood estimate θ̂ML that, as M increases, converges in probability to the true value of the parameter vector. A consistent maximum-likelihood estimate is unique [16].

  • Asymptotic normality. Because the ML estimate θ̂ML of θ is a random variable, it makes sense to consider its probability density function. As the sample size M increases, the probability density function of θ̂ML converges to the probability density function of a normally distributed random variable with mean equal to the true value of the parameter we want to estimate and covariance matrix equal to the inverse of the Fisher information matrix [17].


4. Graphics processing units and CUDA

Driven by the insatiable demand for real-time rendering in gaming and entertainment, graphics processing units (GPUs) have become highly parallel devices capable of running general-purpose code. Newer products that offer an ever-increasing amount of computational power are constantly introduced in the market at very competitive prices.

Programming languages have been developed to harness the parallel capabilities of GPU devices. The most widespread language for GPU programming is called compute unified device architecture (CUDA), which was introduced in 2006 by NVIDIA. Due to its similarity to C, CUDA has rapidly become the de facto programming language for GPUs.

4.1. The CUDA programming model

In CUDA, the GPU is usually referred to as the device and the computer that hosts it is referred to as the host. Many GPU devices can be installed in the same host, and it is not uncommon to have systems with more than one GPU device. Each GPU device has its own memory, which we refer to as device memory. In CUDA, it is also common to refer to the memory installed in the host as host memory. CUDA provides library functions to allocate blocks of memory in device memory and to transfer blocks of data from host memory to device memory and vice versa. As shown in Figure 3, a typical GPU device includes some GPU cores (ranging in number from a few hundreds to a few thousands) and some control logic.

Figure 3.

Diagram of a computer equipped with a GPU device (adapted from [3]).

Programmers access the parallel capabilities of a CUDA-enabled device by writing kernels, which are pieces of code that look very similar to regular C functions. In CUDA, a kernel is run in parallel on many different GPU cores. A kernel in execution is referred to as a thread. Threads are grouped into blocks, which can be 1D, 2D or 3D, and blocks are grouped into a grid. Grids can be 1D, 2D or 3D. The size and dimensionality of blocks and grids are decided by the programmer via an execution configuration, which is also used to call a kernel and instruct the GPU hardware to execute threads.

In a GPU device, thread scheduling is extremely efficient and it is performed by the hardware and without the intervention of the programmer. To improve performance, the hardware also suspends execution for threads that are waiting for completion of memory transfers between device memory and GPU registers. When that happens, the hardware selects for execution threads that already have data to be processed. The programmer is typically unaware of what threads are running at any given time, nor does he know what kernel instruction is being executed by a specific thread. In other words, the programmer cannot rely on any particular scheduling order of GPU threads. There are, however, situations in which it is necessary to ensure that a block of threads has reached a certain instruction in a kernel before all the threads in the block can continue. In CUDA, this is accomplished via synchronization barriers.

Synchronization barriers are often used when threads have to exchange data with each other via shared variables. Without any synchronization mechanism, a thread will not be able to know if the content of a shared variable has already been written by a cooperating thread. Synchronization barriers solve this problem by suspending thread execution until all the threads in the same block have reached a synchronization barrier. In CUDA, synchronization barriers are allowed only among the threads in the same block.

GPU devices are equipped with different memory spaces. This includes global memory (which is used to share data between the host and the device) as well as shared memory. While global memory is rather slow and physically separated from the GPU cores, shared memory is much faster and it is built on the same chip as the GPU cores. Threads use shared memory to efficiently share data among them.

Another type of memory space available in a GPU device is texture memory. As the name suggests, texture memory has been designed to speed up and facilitate 3D rendering in computer games and computer-generated scenes. This is the reason why texture memory supports unique features including hardware-based on-the-fly interpolation of texture data.

4.2. Workflow of a CUDA application

The basic steps that are needed to execute a kernel are summarized in Figure 4.

Figure 4.

Workflow of a CUDA application (adapted from [3]).

In a typically CUDA application, one or more blocks of device memory are allocated by the host via the cudaMalloc(…) CUDA library function. The host then copies input data from host memory via one or more cudaMemcpy(…) function calls. Kernel execution is started with a call of the form my_kernel < <<N, M>> > (…), in which my_kernel is the name of the kernel, N is the grid size and M is the block size. Parameters, such as pointers to device memory, are passed to the kernel as parameters enclosed in parentheses. Once all the threads have finished executing, the control returns to the CPU. Results can be copied from device memory to host memory via one or more calls to cudaMemcpy(…). Finally, device memory that was previously allocated is released via the cudaFree(…) call.

The CUDA environment automatically defines read-only built-in variables that can only be used in a kernel. These variables include blockIdx, blockDim and threadIdx. The variable threadIdx enumerates threads within each block in ascending order and starting from 0. Similarly, blockIdx enumerates blocks within the grid. The size of each block is contained in the variable blockDim. Built-in variables are used by the programmer to calculate which element(s) of the input array(s) a thread has to work on, or where in device memory a result has to be stored.


5. Algorithms for photon-counting detectors

To make our discussion more concrete, we begin this section by considering a GPU algorithm for maximum-likelihood estimation (MLE) of position of interaction of gamma-ray photons in an Anger camera. We then comment on ways to adapt our algorithm to other cases, including photon-counting and photon-processing detectors (Section 6).

We showed in Section 2.1 that digitized PMT signals g1,,gK obey Poisson statistics and we denoted the means of g1,,gK as g¯1RE,,g¯KRE, respectively. If photon energy E is known, the likelihood for the estimation of position R under the assumption of Poisson noise is written as:


Functions g¯1RE,,g¯KRE are called mean detector response functions (MDRFs) and they can be either measured, derived analytically or estimated via simulation codes. Using Eq. (13), an ML estimate R̂ML=x̂MLŷML of R=xy can be found as:


Equivalently, we can consider the logarithm of LRg1gKE in the maximization step and write:


in which we omitted the loggk! term as it does not depend on R and, therefore, it will not affect the estimate R̂ML.

The algorithm we present here uses the fact that, for fixed g1,,gK and E, the log-likelihood Rg1gKE=logLRg1gKE is a smooth function of R. Hence, maximum-likelihood estimate R̂ML can be searched for in an iterative fashion by first evaluating Rg1gKE over a coarse grid of Sx-by-Sy points that uniformly spans the whole detector space. The point of the grid that maximizes Rg1gKE is used in the next iteration as the center of a new grid smaller than the previous one by a factor α>1. This process is repeated M times. We refer this algorithm as the contracting grid algorithm [18, 19].

Figure 5 shows pseudocode for a possible GPU implementation of the contracting grid algorithm. We used superscripts to make it clear on which memory space a given variable is stored. Variables with no superscript will denote either numerical constants (such as the contracting factor α) or local variables, typically stored in GPU registers.

Figure 5.

GPU pseudocode for ML estimation via a contracting-grid search algorithm.

The algorithm of Figure 5 assumes that an array of R PMT sample vectors g0,gR1 is available. These data are stored in device memory and we decided to use in our GPU implementation a grid of size R×1×1 with 2D blocks of size Sx×Sy. This thread hierarchy follows naturally from the data we have to process and how we process them. In fact, the block index is used to index one of the g0,gR1 vectors, while the 2D thread index is used to identify a point of the contracting grid (of size Sx×Sy).

Our GPU implementation uses shared memory to either store data that are used multiple times during thread execution (this would be the case, e.g., of PMT data vector gr) or to share common variables among all the threads in the same block. Each thread in a block calculates the likelihood xygshared for one of the points in the contacting grid and shares the value of the likelihood among all the threads in the same block.

MDRF data (previously estimated via simulation codes [4]) are stored in a 2D layered texture and used during the calculation of the log-likelihood xygshared (denoted as i,jshared in the pseudocode). MDRF data are transparently interpolated by the hardware during texture fetching. Moreover, we set texture boundary conditions so that, should the point xy fall outside the detector’s entrance face, g¯kxy would evaluate to 0. Physically, this can be interpreted as no PMT signals being produced for a gamma-ray “interaction” outside the detector’s entrance face.

Besides code speed and clarity, a layered texture makes our code extremely flexible. By changing Sx, Sy and/or α, it is possible to change the size of the contracting grid or its contracting factor to find the desired trade-off between speed and estimation accuracy.

5.1. Comments and applications to photon counting

The algorithm we discussed above was specifically designed for gamma-ray data and it uses calibration data in the form of mean detector response functions (MDRFs). The output of the algorithm is a list of positions in the form x̂0ŷ0x̂R1ŷR1. This list can directly be fed to an algorithm for list-mode image reconstruction. Implementation details and results are reported in [20]. Common practice, however, is to bin the list-mode data and count the number of points x̂rŷr that fall within each bin. As we argue in [21], one drawback of this step is that it introduces some error, as all the points within each bin are represented with a single point location.

The algorithm we presented in Figure 5 is one example of a contracting grid algorithm for maximum-likelihood estimation. The main assumption we made was that the likelihood Lgθ) (or its logarithm) is a smooth function of θ, the vector of parameters we want to estimate. This is true for many estimation problems. Therefore, the algorithm of Figure 5 provides a general pattern for the implementation of maximum-likelihood estimation on a GPU device.


6. Photon-processing detectors and algorithms

For each photon-absorption event in a detector, there are many parameters we can consider. These parameters include photon position R with respect to a plane or a reference point, direction of propagation s and energy E the photon deposited in the detector. Depending on the application (e.g., single-photon emission computed tomography for 4D angiography or coincidence detection in positron emission tomography), we might also need to consider the time t the photon impinged on the detector. Finally, some imaging techniques (such as two-photon quantum imaging) do require measurements of quantum mechanical parameters, one example being quantum spin.

6.1. Mathematical description

We refer to a set of photon parameters as an attribute vector, and we denote it as A. Hence, depending on the application, an attribute vector might have five or more components. We denote the number of components of A as N. Because of noise, it is not possible to estimate exactly the components of A and we use the notation  to denote an estimated attribute vector.

We define a photon-processing detector as any imaging device that [9]:

  • uses a gain mechanism (such as an image intensifier) to obtain multiple measurements (e.g., multiple pixel values) for each absorbed photon;

  • uses these measurements to perform maximum-likelihood estimation of photon attribute vector Âj, forj=1,,J;

  • stores the estimated attributes at full precision as a list Â=Â1ÂJ and without performing any binning.

Photon-processing detectors are fundamentally different than photon-counting detectors. While photon-counting detectors only consider photon position and record the number of photons (or charged particles) that fall within each bin over a predetermined amount of time, photon-processing detectors use maximum-likelihood to estimate a wide range of attributes and retain all the estimated information at full precision as the list Â=Â1ÂJ.

The full information from a photon-processing detector is retained if we simply store the N estimated attributes for each of J photons as the list Â, but an equivalent construction as a random point process in attribute space offers new theoretical insights. From Â, we introduce this point process as [3, 9]


where δ is the N-dimensional Dirac delta function. The mean of the point process uA is obtained by averaging over the statistics of each of the attribute vectors Â1,,ÂJ and then over J itself for a given objectf. This calculation gives a function u¯Af of A for fixed f. This function can be regarded as a vector u¯f in the Hilbert space L2RN, which is the vector space of square-integrable functions of N real variables. As shown in [22, 23], we can introduce the linear operator L that maps the object f (belonging to the infinite-dimensional Hilbert space L2R3) to u¯f. In symbols,


A similar expression but for photon-counting detectors is:


in which the vector g¯f is the mean over many realizations of the vector g. Vector g¯f belongs to the Euclidian vector space EM, which is the space of all M-dimensional vectors. We refer to H as a continuous-to-discrete operator [8] as it maps the function fr of continuous variable r to a discrete vector g¯f with M components. On the other hands, L is a continuous-to-continuous operator as it maps fr to the function u¯Af of continuous variable A [8].

The key difference between H and L is that H must necessarily have a nontrivial null space, as it maps vectors in an infinite-dimensional Hilbert space to vectors in a finite-dimensional vector space. This means that for any imaging system that produces photon-counting data, there exist nonzero objects fnull that, on average, do not produce any data. Equivalently, we can say that there exist two objects f1 and f2 with f1f2 for which Hf1=Hf2. A continuous-to-continuous operator—such as the operator L defined above—maps an object in an infinite-dimensional Hilbert space to another infinite-dimensional Hilbert space. Therefore, the same dimensionality analysis we considered for H does not apply to L. In fact, L might allow a lower dimensional null space than H for the same imaging system [21].

6.2. Relationship to radiometry and the Boltzmann transport equation

The word “radiometry” refers to a set of techniques used in optics to describe and calculate the distribution of light. An important concept used in radiometry is that of radiance, denoted as Lrs, which is a function that describes the radiant flux in an optical system as a function of three-dimensional spatial position r and direction s. Since the radiant flux is measured in Watts, the units of Lrs are Watts per square meter per steradian, or W/ (m2 ster) [8]. From the radiance, other important radiometric quantities can be calculated. This includes the irradiance (power per unit area), radiant intensity (power per unit solid angle) and radiant flux (power).

Spectral dependence can be introduced in the basic definition of radiance by considering radiance per unit wavelength λ, which we denote as Lλrsλ. The units of Lλrsλ are W/ (m2 ster nm), provided that the units of wavelength are nanometers (nm). Spectral radiance can also be measured in photon units by first expressing wavelength in terms of energy (E=hc/λ, in which h is Planck’s constant and c is the speed of light) and then by dividing Lλrsλ by the energy of a photon. We denote this new quantity as Lp,ErsE, and its units are (photons/s)/(m2 ster). Finally, we can consider a time-dependent spectral photon radiance, and we denote this function as Lp,ErsEt.

In radiometry, the Boltzmann transport equation (BTE) allows to calculate Lp,ErsEt at any point inside an optical system by taking into account absorption, emission, scattering and propagation of light. In its most general form, the BTE is written as [8]:


Each term on the right-hand side can be worked out explicitly [8, 9] to obtain:


where cm is the speed of light in the medium, μtot is the total attenuation coefficient (with contributions from both absorption and scattering processes), K is an integral operator describing the angular and energy dependence of the scattering, and Ξp,E describes any light source. In general, the function Ξp,E depends on r,s, E and t. If the light source is isotropic (i.e., Ξp,ErsEt does not depend on s) and independent of time, we can write


where the 4π term (units: ster) accounts for integration of a solid angle over a sphere. Under these hypotheses, a steady-state solution to Eq. (20) is found by setting the partial derivative Lp,E/t to zero. The result is


which can be further rewritten in operator form as


provided that


We refer to B as the Boltzmann operator. If we insert Eq. (23) into Eq. (17), we get


which describes a practical way to obtain the function u¯f from knowledge of the radiance function Lp,ErsE inside an optical system and the Boltzmann operator B.

6.3. Particle-processing detectors

An example of a particle-processing detector for beta particles is shown in Figure 6. This detector includes two layers of ultrathin phosphor foils separated by an air gap, an image intensifier, a high numerical aperture lens system and a light sensor [1, 7].

Figure 6.

A particle-processing detector for beta particle (adapted from [1]).

In Figure 6, an incoming beta particle interacts with a layer of phosphor (just a few microns thick) at location r1=x1y1 where it deposits some of its energy, which the layer of phosphor gives off as a flash of visible light. The particle further propagates and interacts at r2=x2y2 with a second phosphor layer, thus producing a flash of visible light here as well. Light flashes produced at each layer get amplified by an image intensifier and imaged onto a sensor. The flash of light generated at the first layer spreads out considerably as it propagates through the air gap, thus resulting in a much broader signal on the sensor than that corresponding to the flash of light generated at the second layer. This is used to determine at which layer each flash of light was generated. The direction s of the particle and its position at either phosphor foil can be estimated from the two interaction positions and the distance between the two foils (assumed known). If the particle’s residual energy E is of interest, the second phosphor foil can just be replaced by a thick scintillator, so that the particle is stopped.

A CUDA algorithm for maximum-likelihood estimation of position and particle direction for the setup of Figure 6 has been developed by Garrett Hinton at the Center for Gamma-Ray Imaging, University of Arizona. Following the same approach outlined in Section 5, the algorithm uses a four-dimensional contracting grid to simultaneously estimate location r1=x1y1 and r2=x2y2 from an image frame collected with an ultrafast CMOS camera. Experimental setup and preliminary results have been presented in [1, 7].


7. Summary and applications

This chapter provided a general overview of detector technology and algorithms for photon counting and photon processing. We started by describing detectors suitable for photon counting and photon processing. Statistical models for the data produced by these detectors were presented. We then introduced maximum-likelihood estimation and discussed its properties. We emphasized that a maximum-likelihood estimate is any parameter that maximizes the likelihood function given the detector output, and we pointed out that the likelihood function is the probability density function of the detector output conditioned on the parameter being estimated.

We then commented on graphics processing units (GPUs) and the CUDA programming environment. Through CUDA-like pseudocode, we provided a maximum-likelihood algorithm for estimation of position of interaction for gamma-ray cameras. This algorithm used a contracting-grid approach to find a maximum of the likelihood function. Our approach heavily relied upon GPU textures to quickly retrieve calibration data. The same approach is applicable to many estimation problems.

Photon-processing detectors were introduced and defined as any system that collects multiple measurements to perform maximum-likelihood estimation of multiple event parameters (such as position, direction and energy), which are stored as a list and in full precision in the memory of a computer. The same data can also be represented as a point process, and we introduced a linear operator that maps the object being imaged to the mean of this point process. We used a dimensionality analysis to describe the advantages of photon-processing detectors over photon-counting detectors.

Particle-processing detectors are a variation of photon-processing detector. As an emerging technology, particle-processing detectors will find applications in many fields, one of them being medical imaging. In a new technique, called charged-particle emission tomography (CPET), particle-processing detectors are being evaluated for 3D in vivo imaging with alpha and beta particles [1, 7]. Like photon-processing detectors, particle-processing detectors convey a larger amount of information than conventional detectors for charged particles. This enables high-resolution 3D reconstruction of the distribution of radionuclides emitting charged particles without the need to kill an animal to image a collection of thinly sliced tissue sections.

Drug development will take advantage of CPET to determine drug pharmacokinetics, 3D transduction across cell membranes and targeting to tissues of interest. In the development of internal radioimmunotherapy, CPET imaging can be used to collect data on the 3D heterogeneous distributions of targeting molecules and in the estimation of delivered radiation dose. Finally, CPET will likely become a valuable technique in the emerging fields of personalized medicine and theranostics, in which diagnostics and therapy are combined in an attempt to avoid the “one-size-fits-all” approach to treatment that is often successful for some patients but not for others.



This chapter has been supported by National Institutes of Health (Grants R01 EB000803 and P41 EB002035).


  1. 1. Yijun Ding. Charged-Particle Emission Tomography [dissertation]. Tucson, AZ; 2016.
  2. 2. Anger HO. Scintillation Camera. Review of Scientific Instruments. 1958;29(1):27-33
  3. 3. Luca Caucci. Task Performance with List-Mode Data [dissertation]. Tucson, AZ; 2012
  4. 4. Hunter WCJ, Barrett HH, Furenlid LR. Calibration method for ML estimation of 3D interaction position in a thick gamma-ray detector. IEEE Transactions on Nuclear Science. 2009;56(1):189-196
  5. 5. Llopart X, Campbell M, Dinapoli R, San Segundo D, Pernigotti E. Medipix2: A 64-k pixel readout Chip with 55-μm. IEEE Transactions on Nuclear Science. 2002;49(5):2279-2283
  6. 6. Bouchami J, Gutiérrez A, Houdayer A, Jakůbek J, Lebel C, Leroy C, Macana J, Martin JP, Platkevič M, Pospíši S, Teyssierl C. Study of the charge sharing in silicon pixel detector by means of heavy ionizing particles interacting with a Medipix2 device. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment. 2011;633(Supplement 1):S117-S120
  7. 7. Ding Y, Caucci L, Barrett HH. Charged-particle emission tomography. Medical Physics. 2017;44(6):2478-2489
  8. 8. Harrison H. Barrett, Kyle J. Myers. Foundations of Image Science. Hoboken, NJ: Wiley-Interscience; 2004
  9. 9. Caucci L, Myers KJ, Barrett HH. Radiance and photon noise: Imaging in geometrical optics, physical optics, quantum optics and radiology. Optical Engineering. 2016;55(1):013102
  10. 10. Miller BW, Gregory SJ, Fuller ES, Barrett HH, Barber HB, Furenlid LR. The iQID camera: An ionizing-radiation quantum imaging detector. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment. 2014;767:146-152
  11. 11. Aldrich J. R. A. Fisher and the making of maximum likelihood 1912–1922. Statistical Science. 1997;12(3):162-176
  12. 12. Cramér H. Mathematical Methods of Statistics. Princeton, NJ: Princeton University Press; 1956
  13. 13. Rao CR. Information and the accuracy attainable in the estimation of statistical parameters. Bulletin of the Calcutta Mathematical Society. 1945;37:81-89
  14. 14. Zehna PW. Invariance of maximum likelihood estimators. Annals of Mathematical Statistics. 1966;37(3):744
  15. 15. Moore DS. Maximum likelihood and sufficient statistics. The American Mathematical Monthly. 1971;78(1):50-52
  16. 16. Huzurbazar VS. The likelihood equation, consistency and the maxima of the likelihood function. Annals of Human Genetics. 1947;14(1):185-200
  17. 17. Fisher RA. Theory of statistical estimation. Mathematical Proceedings of the Cambridge Philosophical Society. 1925;22(5):700-725
  18. 18. Furenlid LR, Hesterman JY, Barrett HH. Real-time data acquisition and maximum-likelihood estimation for gamma cameras. In: 14th IEEE-NPSS Real Time Conference; 4–10 June 2005. Stockholm, Sweden; 2005. p. 498-501
  19. 19. Hesterman JY, Caucci L, Kupinski MA, Barrett HH, Furenlid LR. Maximum-likelihood estimation with a contracting-grid search algorithm. IEEE Transactions on Nuclear Science. 2010;57(3):1077-1084
  20. 20. Luca Caucci, William C. J. Hunter, Lars R. Furenlid, Harrison H. Barrett. List-mode MLEM Image Reconstruction from 3D ML Position Estimates. In: IEEE Nuclear Science Symposium Conference Record (NSS/MIC), 30 Oct–6 Nov. 2010. Knoxville, TN, USA; 2010. p. 2643-2647
  21. 21. Luca Caucci, Abhinav K. Jha, Lars R. Furenlid, Eric W. Clarkson, Matthew A. Kupinski, Harrison H. Barrett. Image Science with Photon-Processing Detectors. In: IEEE Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC), 27 Oct–2 Nov, 2013. Seoul, South Korea; 2013. p. 1-7
  22. 22. Andre Lehovich. List-mode SPECT Reconstruction Using Empirical Likelihood [dissertation]. Tucson, AZ; 2005
  23. 23. Caucci L, Barrett HH. Objective assessment of image quality. V. Photon-counting detectors and list-mode data. Journal of the Optical Society of America A. 2012;29(6):1003-1016

Written By

Luca Caucci, Yijun Ding and Harrison H. Barrett

Submitted: 23 May 2017 Reviewed: 02 November 2017 Published: 20 December 2017