This chapter presents an overview of the technological aspects of data acquisition (DAQ) systems for particle physics experiments. Although data acquisition is a general topic, particle physics experiments pose challenges that deserve a specific description and for which special solutions may be adopted.
Generally speaking, most particle physics experiments use different types of sensors whose information has to be gathered and processed to extract as much meaning as possible about the process being analyzed. This implies the need not only for hardware coordination (timing, data width, speeds, etc.) between the acquisition subsystems for the different types of sensors, but also for an information processing strategy that extracts more significance from the analysis of the data from the whole system than from a single type of sensor. Also, from the point of view of hardware resources, each type of sensor is replicated many times (even millions) to achieve the required spatial coverage. This leads directly to the extensive use of integrated devices where needed to improve cost and space utilization.
This chapter will therefore cover the specific technologies used in the different stages into which a general DAQ system in particle physics experiments can be divided.
The rest of the chapter follows the natural flow of data from the sensor to the final processing. First, we describe the most general abstraction of DAQ systems, pointing out the general architecture used in particle physics experiments. Second, the common types of transducers are described along with their main characteristics. A review of the different hardware architectures for the front-end system follows. Then, we cover several common data transmission paradigms, including modern standard buses and optical fibers. Finally, present hardware processing solutions are reviewed.
2. Data acquisition architectures
Data acquisition aims to read the information from one or many sensors for real-time use or for storage and later off-line analysis. Strictly speaking, we may distinguish four different activities in a sensor processing system: acquisition, processing, integration and analysis. However, most of the time the term DAQ system refers to all of these activities as a whole.
It is worth noting that not every DAQ system includes these four activities; this depends on its complexity and application. For example, in single-sensor systems neither integration nor processing may be necessary. On the other hand, in systems with replicated sensors, processing could be minimal, but integration is crucial. If the system is based on different types of sensors, processing is necessary to make the readings of the various sensors compatible, and integration is needed to obtain comprehensive information about the environment. However, the majority of DAQ systems include all four activities: the physical variable is sensed in the acquisition activity; the data collected are processed properly (for example, by scaling or formatting) before being transported to the integration activity; the output of the integration is more meaningful information on which the analysis activity can base its tasks (storage, action on a mechanism, etc.).
2.1. Architectures of sensor systems
As mentioned before, a DAQ system consists of four activities: acquisition, processing, integration and analysis. Depending on the characteristics of the process under study, we have to choose how to organize them to suit our needs, as we shall see now.
Collection of sensors
A collection of sensors is a set of sensors arranged in a certain way. They can be in series, parallel or a mixed combination of these two basic arrangements.
The choice of the particular configuration will depend on the application. The integration of information is carried out progressively through the different sensors to get a final result.
In a centralized system, data from the sensors are transmitted to a central processor to be combined. If the volume of data is large, this organization may require considerable bandwidth. For these cases, the DAQ system can be arranged as a hierarchy of subsystems.
Consider the example in figure 1a. Data D1 and D2 are combined in the data integration stage into feature F12. Similarly, D3 and D4, D5 and D6, and D7 and D8 yield features F34, F56 and F78, respectively. Features F12 and F34 are then combined in the feature integration stage into a local decision Dec 1-4, and F56 and F78 generate Dec 5-8. Finally, the local decisions are combined in the decision integration stage into a global decision Dec 1-8.
The interesting aspect of this organization is that an increase in the "size" of the problem does not translate into a similar increase in the organization of the DAQ, i.e., the system does not grow linearly with the problem. This is true provided that the data and feature fusion stages reduce the volume of information.
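The hierarchy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the number of inputs is assumed to be a power of two, and the mean of two values stands in for whatever fusion function a real system would use at each stage.

```python
from typing import Callable, List, Sequence


def fuse_pairs(items: Sequence[float],
               combine: Callable[[float, float], float]) -> List[float]:
    """Combine neighbouring items two by two, halving the number of items."""
    if len(items) % 2 != 0:
        raise ValueError("this sketch assumes an even number of inputs per stage")
    return [combine(items[i], items[i + 1]) for i in range(0, len(items), 2)]


def hierarchical_integration(data: Sequence[float],
                             combine: Callable[[float, float], float]) -> List[List[float]]:
    """Return every stage of the tree, from the raw data up to the global result."""
    stages = [list(data)]
    while len(stages[-1]) > 1:
        stages.append(fuse_pairs(stages[-1], combine))
    return stages


# Eight made-up readings standing in for D1..D8.
readings = [1.0, 3.0, 2.0, 4.0, 5.0, 7.0, 6.0, 8.0]
stages = hierarchical_integration(readings, lambda a, b: (a + b) / 2)
for level, values in enumerate(stages):
    print(level, values)
```

Each stage halves the amount of information handed to the next one, so eight inputs need only three integration stages, which is the sub-linear growth the text refers to.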
Figure 1b shows the integration of various sensors s1, s2, s3 and s4 (not all of the same type). We assume that s1 and s2 are of the same type, s3 of a second type and s4 of a third type. In this case, the integration of the information has to be done to ensure that data from the sensors are compatible.
The processing of the four readings must be carried out sequentially in three phases: t1 = F1 (s1, s2), t2 = F2 (t1, s3) and t3 = F3 (t2, s4). The final output is t3 = F3 (F2 (F1 (s1, s2), s3), s4). We must clarify that the function F1 uses data from the two sensors competitively while F2 and F3 use complementary data.
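As a minimal sketch of this three-phase composition, the Python below chains three hypothetical fusion functions. The averaging in F1 and the field names "position", "energy" and "time" are assumptions made only for illustration; the text does not specify what the sensors measure.

```python
def f1(s1: float, s2: float) -> float:
    """Competitive fusion: two sensors of the same type measure the same quantity."""
    return (s1 + s2) / 2.0


def f2(t1: float, s3: float) -> dict:
    """Complementary fusion: attach a reading of a second type to the result of F1."""
    return {"position": t1, "energy": s3}


def f3(t2: dict, s4: float) -> dict:
    """Complementary fusion: attach a reading of a third type to the result of F2."""
    return {**t2, "time": s4}


def integrate(s1: float, s2: float, s3: float, s4: float) -> dict:
    """Compute t3 = F3(F2(F1(s1, s2), s3), s4) phase by phase."""
    t1 = f1(s1, s2)
    t2 = f2(t1, s3)
    t3 = f3(t2, s4)
    return t3


print(integrate(1.0, 3.0, 10.0, 0.5))
```

The phases cannot be reordered or run in parallel, because each one consumes the output of the previous phase.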
2.2. Distributed processing of sensors
In the following we will focus on DAQ systems where the four activities described before take place in a distributed way, known as distributed processing systems. This case is of special interest, as many present DAQ systems for particle physics follow it.
To better understand the different aspects of the processing of sensors let us consider a distributed processing system of sensors with the main objective of detecting targets present in the surveyed space. This example may apply to particle physics experiments but also to other fields like distributed control systems, sensory systems for robots, etc.
Let us assume that there is a finite number of resources (sensors and processors) in the distributed system. Consider a system in which there are N sensors (S1 to SN) and P processors (EP1 to EPP). The N sensors can, for example, track objects in the observation space, and we assume that they are all of the same type, that is, they form a system based on physically replicated sensors. Let us suppose they have been organized in P sensor groups or clusters of N/P sensors each. In our example, there are three groups, each with three sensors and a processor to control them.
The main task, T, is to detect and possibly follow targets across the surveyed space. Consider two possibilities:
The space of observation is too broad and therefore cannot be efficiently covered by any of the clusters of sensors.
The space of observation can be covered by all the groups of sensors, but the system requires a response in real time for the follow-up of the target.
In the first case, part of the space can be assigned to each group of sensors. Collectively, they will cover all the surveyed space. In the second case, we can assign to each cluster the task of following some specific number of targets; ideally, each group should be following a single target.
In our example, the distributed processing system breaks down the main task T into P subtasks; this operation is known as task decomposition. The objective of each subtask Ti is to detect and follow the i-th target in the observation space. Each subtask is assigned to a processing element, EPi, that controls the three sensors of the group.
Each group of replicated sensors has a local processor. The processor is responsible for local processing and control; it can control the sensors assigned to it and obtain the values from them. Ideally, the sensors of the clusters should always obtain the same value, but in practice they give different values following some statistical distribution.
Suppose that each group can see only part of the space, but the targets can move anywhere within this space. In this case, the system would require communication between the local processors to share the information about the object and to know when it moves from one area to another.
Finally, the integrator is responsible for combining data from sensors and/or their abstractions. It should be noted that we started with nine sensors. There are three groups of three sensors each. The three sensors on each group provide redundant information. The processing of each group combines the redundant information to obtain a solution of a sub-problem - what object is the one the group is observing?
In this way, the integrator gets three sets of data, each coming from a group of sensors. With these data, the observer determines that there are three objects in the space of observation. The distance from the integrator to the sensor is not, in general, negligible, so the results of the local processing must be transmitted in some way.
In our example, the result obtained by the integrator is a map of the objects present in the whole surveyed space.
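The nine-sensor example can be sketched as follows. This is an illustrative assumption rather than the method of any particular experiment: the readings are made-up numbers, the redundant values of each cluster are merged with a plain mean, and the names local_processing and integrator are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def local_processing(readings: List[float]) -> float:
    """Processing element: merge the redundant readings of one cluster."""
    return mean(readings)


def integrator(cluster_estimates: Dict[str, float]) -> Dict[str, float]:
    """Build the global map: one estimated target position per cluster."""
    return dict(cluster_estimates)


# Three clusters (EP1..EP3) of three replicated sensors each, made-up readings.
clusters = {
    "EP1": [10.1, 9.9, 10.0],
    "EP2": [20.2, 19.8, 20.0],
    "EP3": [30.0, 30.3, 29.7],
}

target_map = integrator(
    {name: local_processing(readings) for name, readings in clusters.items()}
)
print(target_map)
```

Only one value per cluster travels to the integrator, which is what keeps the bandwidth between the local processors and the integrator low.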
The DAQ system is assumed to have a knowledge base that can analyze and interpret the data and take the appropriate action depending on the result obtained. In our case, the system interprets that there are three objects that occupy the space of observation; the reaction of the system will depend on the knowledge base.
2.3. Distributed sensor networks
The use of different, intelligent sensors that are distributed spatially and geographically has grown constantly in applications such as robotics, particle physics experiments, medical imaging, tracking radar, air navigation and the control of activities on production lines, to name a few. These and similar systems are called distributed sensor networks, or DSNs. Alternatively, we could define a distributed sensor network as a set of intelligent sensors distributed spatially and designed to obtain data from the environment that surrounds them, abstract the relevant information, infer the observed object from it and derive from all this an action appropriate to the scenario.
2.4. Data acquisition systems in particle physics experiments
The distributed sensor network (DSN) paradigm fits what we generally implement as DAQ systems in particle physics experiments. Because of the need for spatial coverage or for an identification scheme based on the detection of different types of particles, the DAQ system will include several sensors of the same type or multiple replicated sensors of different types. The hardware architectures that read them all out are implemented in a distributed and possibly hierarchical way due to the high data volume, high data rate or geographical distribution of the sensors. A comparison of hierarchical DSNs with other types of solutions may be found in [3, 4].
3. Radiation detection. Transducers
Radiation detection involves converting the impinging energy, in the form of radiation, into an electrical parameter that can be processed. Transducers are responsible for transforming the radiation energy into an electrical signal. The type of detector has to be specific to each kind of radiation and its energy interval. In general, several factors must be taken into consideration, such as the sensitivity, the energy resolution, the response time and the efficiency of the detector.
Energy conversion can be carried out either in a direct mode, if the signal is detected directly through the ionization of a material (figure 2a), or in an indirect mode, when several energy conversions take place before the electrical signal is obtained (light production plus electrical conversion, figure 2b). The following sub-sections describe the devices most commonly used in both medical and nuclear physics applications.
3.1. Direct detection
Direct detection with ionization chambers is a common practice. They are built with two electrodes to which a certain electrical potential is applied. The space between the electrodes is filled with a gas, which responds to the ionization produced by radiation passing through it. Ionizing radiation dissipates part or all of its energy by generating electron-ion pairs, which are set in motion by the electric field and consequently produce an electrical current.
Another possibility that provides good results in radiation detection is the semiconductor detector. These are solid-state devices that operate essentially like ionization chambers, but in this case the charge carriers are electron-hole pairs. Nowadays, the most efficient detectors are made of silicon (Si) or germanium (Ge). Their main advantage is their high energy resolution; in addition, they provide a linear response over a wide energy range, a fast pulse rise time, several geometric shapes (although the size is limited) and insensitivity to magnetic fields.
3.2. Indirect detection
Scintillators are materials that exhibit luminescence when ionizing radiation passes through them. The material absorbs part of the incident energy and re-emits it as light, typically in the visible spectrum. Sir William Crookes discovered this property in 1903 when bombarding ZnS with alpha particles.
Organic scintillators belong to the class of aromatic compounds such as benzene or anthracene. They are made by combining a substance in higher concentration, the solvent, with one or more compounds at lower concentrations, the solutes, which are generally responsible for the scintillation. They are mainly used for the detection of beta particles (fast electrons, with a linear response from ~125 keV), alpha particles and protons (nonlinear response and lower efficiency for the same energies) and also fast neutrons. They can be found in different states, such as crystals, liquid solutions, scintillating plastics of almost every shape and size, and in gaseous state.
On the other hand, inorganic scintillators are crystals, mainly of alkali halides, such as NaI(Tl), CsI(Tl), LiI(Eu) and CaF2(Eu). The element in brackets is the activator responsible for the scintillation, present in the crystal at a small concentration. Inorganic scintillators have in general a high Z and for this reason they are mainly used for gamma detection, presenting a linear response up to 400 keV. Regarding charged particle detection, they exhibit a linear response to the energy of protons from 1 MeV and of alpha particles from 15 MeV. However, they are not commonly used to detect charged particles [5,6,7].
As shown in figure 2b, the scintillator produces a light signal when it is crossed by the radiation to be detected. It is coupled to a photodetector that is responsible for transforming the light signal into an electrical signal.
3.2.2. Optoelectronic technology for radiation detection
Light detection is achieved through the generation of electron-hole pairs in the photosensor in response to incident light. When the incident photons have enough energy to produce the photoelectric effect, electrons in the valence band jump to the conduction band, where the free charges can move through the material under the influence of an external electric field. The holes left in the valence band by the removed electrons also contribute to electrical conduction, and in this way a photocurrent is generated from the light signal.
3.2.2.1. Photodetectors. Features
One of the main characteristics of a photodetector is its spectral response. The level of electric current produced by incident light varies with the wavelength. The relationship between them is given by the spectral response, expressed as the photosensitivity S (A/W) or the quantum efficiency QE (%). Another important feature is the signal-to-noise ratio (SNR), a measure that compares the level of the desired signal to the level of the background noise.
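The two ways of expressing the spectral response are related through the photon energy. The text does not give the relation, so the sketch below uses the standard conversion S = QE · e · λ / (h · c), with QE expressed as a fraction; the function names are hypothetical and the 25 % at 420 nm example is an arbitrary illustrative value.

```python
PLANCK_J_S = 6.62607015e-34          # Planck constant, J*s
LIGHT_SPEED_M_S = 299792458.0        # speed of light in vacuum, m/s
ELECTRON_CHARGE_C = 1.602176634e-19  # elementary charge, C


def photosensitivity_a_per_w(qe_fraction: float, wavelength_nm: float) -> float:
    """Photosensitivity S in A/W from the quantum efficiency (0..1) and wavelength."""
    wavelength_m = wavelength_nm * 1e-9
    return qe_fraction * ELECTRON_CHARGE_C * wavelength_m / (PLANCK_J_S * LIGHT_SPEED_M_S)


def quantum_efficiency(s_a_per_w: float, wavelength_nm: float) -> float:
    """Inverse conversion: quantum efficiency (0..1) from S in A/W."""
    wavelength_m = wavelength_nm * 1e-9
    return s_a_per_w * PLANCK_J_S * LIGHT_SPEED_M_S / (ELECTRON_CHARGE_C * wavelength_m)


s = photosensitivity_a_per_w(0.25, 420.0)
print(f"S = {s:.4f} A/W")
```

For the same QE, the photosensitivity grows with the wavelength because each watt of longer-wavelength light carries more photons.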
The sensitivity of the photodetector depends on factors such as the active area of the detector and its noise. The active area usually depends on the construction material of the detector. As for the noise, the signal level is expected to exceed the noise associated with the detector and its electronics, taking into account the desired SNR. One important component of the noise in a photodetector is the dark current [8, 9]: the current that flows in the photodetector even in a dark environment, in both photoconductive and photovoltaic modes. Its intensity ranges from pA to nA depending on the quality of the sensor.
The light coming from the scintillator is generally of low intensity, and for this reason some photodetectors use avalanche processes to multiply the electrons and obtain a detectable electric signal. Other parameters that determine the quality of a photodetector are the reverse voltage, the time response and the response to temperature fluctuations.
3.2.2.2. Commercial photodetectors
Photomultiplier tubes (PMTs) are the photodetectors that have been in use the longest across a wide range of applications, mainly because of their good features and results. They are used in applications that require measuring low-level light signals, for example the light from a scintillator, converting a few hundred photons into an electrical signal without adding a large amount of noise. A photomultiplier is a vacuum tube that converts photons into electrons by the photoelectric effect. It consists of a cathode made of photosensitive material, an electron collecting system, several dynodes that multiply the electrons and finally an anode that outputs the electrical signal, all encapsulated in a glass tube. Research on this type of detector has focused mainly on improving the QE, achieved with the development of photomultipliers with bialkali or GaAsP photocathodes, but also on obtaining a better time response. Regarding the building material, four are commonly used depending on the detection requirements and the wavelength of the light (Si, Ge, InGaAs and PbS, figure 3).
In applications where the level of light is high enough, photodiodes are usually the detectors employed, due to their lower price but also to their remarkable properties and response. A photodiode is a semiconductor device with a PN junction sensitive to infrared and visible light. If the light energy is greater than the band gap energy, electrons are pulled up into the conduction band, leaving holes in their place in the valence band. If a reverse bias is applied, an electrical current flows. Thus, the P layer at the surface and the N layer together act as a photoelectric converter.
Other important photodetectors, such as avalanche photodiodes (APDs), have been developed in the last few years. Compared to photodiodes, APDs can detect lower levels of light and are employed in applications where high sensitivity is required. Although the principle of operation, materials and construction are similar to those of photodiodes, there are considerable differences. An APD has an absorption area A and a multiplication area M, which provides an internal gain mechanism that works by applying a reverse voltage. When a photon strikes the APD, electron-hole pairs are created and the electrons are accelerated in the gain area; the avalanche process thus starts as a chain reaction of successive ionizations. Finally, the reaction is controlled in a depletion area. The result, after the incidence of a photon, is the generation not of one or a few electrons but of a large number of them. In this way, a high level of electric current is obtained from a low level of incident light, with gain values around 108.
Silicon photomultipliers (SiPMs) are promising detectors due to their characteristic features, and their use in many applications will probably increase in the coming years. A SiPM is a photon counting device consisting of multiple APD pixels that form an array and operate in Geiger mode. The sum of the outputs of the individual APDs forms the output signal of the device, allowing individual events (photons) to be counted. One advantage is the low reverse voltage needed for its operation, lower than that used with PMTs and APDs. When the applied reverse voltage exceeds the breakdown voltage, the internal electric field is high enough to produce gains of the order of 10^6 [11, 12].
Finally, a CCD (charge-coupled device) camera is an integrated circuit for digital imaging in which the pixels are formed by p-doped MOS capacitors. Its principle of operation is based on the photoelectric effect and its sensitivity depends on the QE of the detector. At the end of the exposure, the capacitors transfer their charge, which is proportional to the amount of light, and the detector is read line by line (although there are different configurations). CCDs offer higher performance regarding QE and noise levels; their disadvantages, however, are their large size and high price. Figure 4 shows the general features of the different photodetectors.
4. Front-end electronics
When talking about front-end electronics in nuclear or particle physics applications, we usually refer to the electronics closest to the detector, covering processes from amplification and pulse shaping to analog-to-digital conversion. The back-end electronics are placed further away from the detector and handle processing tasks.
In this section, we will introduce the common circuits used in the front-end electronics, such as preamplifiers, shapers, discriminators, ADCs, coincidence units and TDCs.
4.1. Unipolar and bipolar signals
In nuclear and particle physics, the signals obtained are usually pulses. Depending on the detector used, parameters such as the rise time, the fall time and the amplitude differ. Figure 5 (left) shows a typical pulse signal with all its important parameters. The bandwidth of a pulse signal is determined by its fastest component, usually the rise time. A typical criterion for choosing the signal bandwidth from the temporal parameters is BW = 0.35/tr, where tr is the signal rise time.
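The rule of thumb BW = 0.35/tr is easy to apply numerically. The helper below is a small sketch; the function name and the 5 ns rise time are illustrative assumptions, not values taken from the text.

```python
def bandwidth_hz(rise_time_s: float) -> float:
    """Signal bandwidth in Hz estimated from the rise time with BW = 0.35 / tr."""
    if rise_time_s <= 0:
        raise ValueError("rise time must be positive")
    return 0.35 / rise_time_s


# A pulse with a 5 ns rise time needs roughly 70 MHz of bandwidth.
print(f"{bandwidth_hz(5e-9) / 1e6:.0f} MHz")
```

Halving the rise time doubles the bandwidth that the front-end electronics must preserve.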
Unipolar and bipolar signals each have advantages and drawbacks, and which performs better depends on the application. By definition, a unipolar signal is only positive or only negative, while a bipolar signal takes both signs. Bipolar signals are more suitable for systems with high counting rates, while unipolar signals, which offer a better SNR, are left for systems with lower counting rates. Figure 5 (right) shows an example of a unipolar and a bipolar signal. For more information, consult references [6, 7].
Often in nuclear and particle physics experiments, the signal obtained at the output of the amplifier is an electrical pulse whose amplitude is proportional to the charge produced by the incident radiation energy. It is quite impractical to use the signal directly without proper amplification, and for this reason preamplifiers are the first stage seen by the pulse signal. They are usually placed as close as possible to the detector to minimize noise, since noise at this stage is critical. Two types of preamplifiers are commonly used depending on the sensed magnitude: voltage-sensitive amplifiers and charge-sensitive amplifiers.
4.2.1.1. Voltage-sensitive amplifiers
They are the most conventional type of amplifier, and they provide an output pulse proportional to the input pulse, which is in turn proportional to the collected charge. This configuration can be used if the equivalent capacitance of the detector and electronics is constant. In some applications, however, such as semiconductor detectors, the detector capacitance changes with temperature, so this configuration is no longer useful. In that case the configuration called charge-sensitive preamplifier is preferred. The basic schematic of the voltage-sensitive amplifier is shown in figure 6 (left).
4.2.1.2. Charge-sensitive amplifiers
Semiconductor detectors, such as germanium or silicon detectors, are themselves capacitive detectors with very high impedance. The capacitance Ci of these detectors fluctuates, making the voltage-sensitive amplifier inoperable. The idea of the charge-sensitive circuit is to integrate the charge on the feedback capacitor Cf. The advantage of this configuration is that the amplitude is independent of the input capacitance if the condition is satisfied. A schematic of the charge-sensitive amplifier is shown in figure 6 (right).
The feedback resistor Rf is used to discharge the capacitor, returning the signal to the baseline level with an exponential tail of around 40-50 µs. This discharge is usually done with a high-value resistor in order to provide a slow pulse tail, minimizing the noise introduced, but a tail that is too slow can lead to pile-up effects. Another approach that avoids the pile-up effect is the optical feedback charge amplifier [6, 7].
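A rough numerical feel for these quantities can be obtained with the ideal relations for a charge-sensitive stage: the output step has magnitude Q/Cf and the tail decays with time constant Rf · Cf. The sketch below assumes an ideal amplifier, returns magnitudes only (the real stage inverts the signal), and uses illustrative component values (Cf = 1 pF, Rf = 50 MΩ) and an assumed 3.6 eV per electron-hole pair in silicon; none of these numbers come from the text.

```python
ELECTRON_CHARGE_C = 1.602176634e-19  # elementary charge, C
EV_PER_PAIR_SI = 3.6                 # assumed mean energy per electron-hole pair in Si, eV


def collected_charge_c(deposited_energy_ev: float) -> float:
    """Charge released in silicon by a given deposited energy."""
    return deposited_energy_ev / EV_PER_PAIR_SI * ELECTRON_CHARGE_C


def output_amplitude_v(charge_c: float, cf_farad: float) -> float:
    """Magnitude of the output step of an ideal charge-sensitive amplifier."""
    return charge_c / cf_farad


def tail_time_constant_s(rf_ohm: float, cf_farad: float) -> float:
    """Time constant of the exponential tail set by the feedback network."""
    return rf_ohm * cf_farad


cf = 1e-12   # 1 pF feedback capacitor (illustrative)
rf = 50e6    # 50 Mohm feedback resistor (illustrative)
q = collected_charge_c(1e6)  # 1 MeV deposited in silicon
print(f"amplitude = {output_amplitude_v(q, cf) * 1e3:.1f} mV")
print(f"tail time constant = {tail_time_constant_s(rf, cf) * 1e6:.0f} us")
```

With these values the tail time constant is 50 µs, in line with the range quoted above, and the output amplitude depends only on the charge and on Cf, not on the detector capacitance.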
4.2.2. Amplifiers and shapers
After pre-amplification, it can be useful to give the pulse a certain shape in order to simplify the measurement of certain magnitudes while preserving the magnitude of interest. Pulse stretching and spreading techniques can be used for pile-up cancellation, timing measurements, pulse-height measurements and preparation for sampling. Another reason to use a pulse shaper is SNR optimization, since a certain shape provides the optimal SNR. Most shaper circuits are based on differentiator (CR) and integrator (RC) circuits. The circuit schematic and time response are shown in figure 7 (left). For further information about these circuits, consult references [7, 15].
4.2.2.1. Shaper networks
Three pulse shapers are introduced in this sub-section, although many more exist: the CR-RC network, the CR-RC network with pole-zero cancellation and the double differentiating CR-RC-CR circuit. CR-RC circuits are implemented as a differentiator followed by an integrator (figure 7a). The differentiated pulse lets the signal return to the baseline level, but it neither has a convenient shape nor allows easy sampling of the maximum point when extracting the energy with pulse height analysis (PHA). The integrator stage improves the SNR and smooths the waveform.
The choice of the time constant is often a compromise between pile-up reduction and the ballistic deficit, which occurs when the shaper produces an amplitude drop. The ballistic deficit can be reduced by choosing a time constant that is large compared to the rise time, or charge collection time, of the detector.
When considering a real pulse instead of an ideal step, CR-RC circuits produce an undershoot (figure 7b), which leads to a wrong amplitude level. This can be solved by adding a resistor that cancels the pole of the exponential tail, removing the undershoot. If the system counting rate is low, this strategy is useful, but as the counting rate increases the pulses start to pile up on each other, creating baseline fluctuations and amplitude distortion.
A solution to this problem is the double differentiating network CR-RC-CR (figure 7c), in which a bipolar pulse is obtained from the input pulse. The main difference lies in the fact that the bipolar pulse does not leave any residual charge, making it very suitable for systems with high counting rates. For systems with low counting rates, unipolar pulses are still preferred, since their SNR is better.
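To make the role of the time constant concrete, the sketch below evaluates the textbook response of an ideal CR-RC shaper to a unit step when the differentiator and integrator share the same time constant τ and are isolated from each other: v(t) = (t/τ) · exp(−t/τ). The equal time constants, the 1 µs value and the function name are assumptions made for illustration.

```python
import math


def cr_rc_step_response(t: float, tau: float) -> float:
    """Output of an ideal CR-RC shaper (equal time constants) for a unit step at t = 0."""
    if t < 0:
        return 0.0
    return (t / tau) * math.exp(-t / tau)


tau = 1e-6                                # 1 us shaping time (illustrative)
times = [i * 1e-8 for i in range(1000)]   # 0 to 10 us in 10 ns steps
samples = [cr_rc_step_response(t, tau) for t in times]

# Locate the maximum of the shaped pulse.
peak_index = max(range(len(samples)), key=samples.__getitem__)
peak_time = times[peak_index]
peak_value = samples[peak_index]
print(f"peak at {peak_time * 1e6:.2f} us, amplitude {peak_value:.3f}")
```

The shaped pulse peaks one time constant after the step and returns to the baseline within a few τ, which is why a longer τ lowers the ballistic deficit but raises the chance of pile-up.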
4.2.3. Discrimination techniques
Discriminator circuits are activated only if the amplitude of the input signal crosses a certain threshold. Discriminators are used to find events and to use them as trigger signals, commonly for time measurement. They also block the noise coming from previous stages, such as the detector and other electronics.
The simplest method for pulse discrimination is leading edge triggering. It provides a logic signal if the pulse amplitude is higher than a threshold. The logic signal is generated at the moment the signal crosses the threshold, but it suffers from the so-called time-walk effect: the dependence of the trigger time on the signal amplitude and rise time. This effect can be seen in figure 8 (left).
Another undesirable effect in pulse discrimination is time jitter (figure 8, right). It is caused by statistical fluctuations at the detector and electronics level and, unlike the time-walk effect, it appears as a timing uncertainty even when the signal amplitude is constant. This effect comes from the noise introduced by the components and also from detector sources, such as the transit time of the electrons in a photomultiplier or the fluctuation in the number of photons produced in a scintillator.
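Time walk can be illustrated with an idealized pulse whose leading edge rises linearly from zero to its amplitude during the rise time. Under that assumption the threshold is crossed at t = tr · threshold / amplitude. The pulse model, function name and numerical values below are illustrative assumptions.

```python
from typing import Optional


def leading_edge_time(amplitude: float, rise_time: float,
                      threshold: float) -> Optional[float]:
    """Threshold-crossing time of a pulse with a linear leading edge.

    Returns None when the pulse is not higher than the threshold,
    in which case the discriminator does not fire.
    """
    if amplitude <= threshold:
        return None
    return rise_time * threshold / amplitude


rise_time = 10e-9   # 10 ns rise time for both pulses
threshold = 0.1     # 100 mV threshold

t_large = leading_edge_time(1.0, rise_time, threshold)   # large pulse
t_small = leading_edge_time(0.2, rise_time, threshold)   # small pulse
print(f"large pulse fires at {t_large * 1e9:.1f} ns")
print(f"small pulse fires at {t_small * 1e9:.1f} ns")
print(f"time walk = {(t_small - t_large) * 1e9:.1f} ns")
```

Both pulses start at the same instant, yet the smaller one fires the discriminator later, which is the time walk.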
Other methods that avoid or reduce these effects are zero-crossover timing and constant fraction discrimination. The zero-crossover timing method is based on double differentiation of the pulse shape. Although this method improves the time resolution and makes the crossing point independent of the amplitude, the pulse shape and rise time still influence the time resolution, making it unsuitable for applications where these fluctuations are very large. Constant fraction discriminators set the threshold as a fraction of the pulse's maximum level. The most common implementation compares a fraction of the signal with a slightly delayed version of it; the zero-crossing point of their difference makes the trigger time independent of the amplitude, with lower jitter.
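The delayed-minus-fraction scheme can be sketched on sampled data as follows. This is a simplified digital illustration, not a description of a specific discriminator: the pulse is an idealized linear ramp, the baseline before the record is assumed to be zero, times are expressed in sample units, and the names ramp_pulse and cfd_time are hypothetical.

```python
from typing import List, Optional


def ramp_pulse(amplitude: float, rise_samples: int, n_samples: int) -> List[float]:
    """Pulse that rises linearly to its amplitude and then stays flat."""
    return [amplitude * min(i, rise_samples) / rise_samples for i in range(n_samples)]


def cfd_time(samples: List[float], fraction: float, delay: int) -> Optional[float]:
    """Zero crossing (in sample units) of delayed(t) - fraction * signal(t)."""
    diff = []
    for i, value in enumerate(samples):
        delayed = samples[i - delay] if i >= delay else 0.0
        diff.append(delayed - fraction * value)
    for i in range(1, len(diff)):
        if diff[i - 1] < 0.0 <= diff[i]:
            # Linear interpolation between the two samples around the crossing.
            step = diff[i] - diff[i - 1]
            return (i - 1) + (-diff[i - 1]) / step
    return None


large = ramp_pulse(1.0, rise_samples=10, n_samples=40)
small = ramp_pulse(0.2, rise_samples=10, n_samples=40)

t_large = cfd_time(large, fraction=0.5, delay=4)
t_small = cfd_time(small, fraction=0.5, delay=4)
print(t_large, t_small)
```

Both pulses give the same crossing time although their amplitudes differ by a factor of five, in contrast with a fixed threshold on the leading edge.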
4.2.4. A/D conversion (analog to digital conversion)
More sophisticated algorithms may be implemented digitally inside logic devices (FPGAs or GPUs). Nevertheless, before performing those algorithms, an analog-to-digital conversion is required, which inevitably introduces a source of error due to the sampling and quantization processes. Two of the most common techniques used in nuclear or particle physics research are the Wilkinson method and, if a very high sampling rate is required, the flash ADC (FADC), although other conversion methods such as successive approximation and sub-ranging ADCs are used as well.
The main difference between the Wilkinson method and FADC sampling is that the Wilkinson method takes one sample per event, based on the time measured while a capacitor is discharged, where the counted time is proportional to the pulse charge. The FADC, on the other hand, takes several samples per event; each digitized value is obtained by comparing the input voltage with the taps of a resistive voltage divider that spans all the possible digitized values. Although the FADC technique leads to the fastest architecture, the number of comparators grows exponentially with the number of bits required [6, 16, 17].
The performance of the analog-to-digital conversion can be tested by measuring certain parameters, such as the differential and integral nonlinearities (DNL and INL), which cause missing codes, noise and distortion, and the effective number of bits (ENOB), which quantifies the resolution lost to distortion and nonlinearities. Further information about ADC parameters can be found in the literature, and their measurement methods in [17, 18].
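The contrast between the two techniques can be sketched with idealized models. The flash model below compares the input with the 2^N − 1 taps of an evenly spaced divider, and the Wilkinson helper only converts a code into the time a counter would need at a given clock frequency. All function names, the reference voltage and the clock frequency are illustrative assumptions.

```python
def flash_comparator_count(n_bits: int) -> int:
    """Number of comparators an N-bit flash ADC needs: one per divider tap."""
    return 2 ** n_bits - 1


def flash_adc(voltage: float, v_ref: float, n_bits: int) -> int:
    """Ideal flash conversion: count the divider taps at or below the input."""
    levels = 2 ** n_bits
    thresholds = [v_ref * k / levels for k in range(1, levels)]
    return sum(1 for tap in thresholds if voltage >= tap)


def wilkinson_conversion_time_s(code: int, clock_hz: float) -> float:
    """Time a Wilkinson converter spends counting clock ticks up to the given code."""
    return code / clock_hz


for bits in (4, 8, 12):
    print(bits, "bits ->", flash_comparator_count(bits), "comparators")

print("3-bit code for 0.5 V with 1 V reference:", flash_adc(0.5, 1.0, 3))
print("Wilkinson time for code 4095 at 100 MHz:",
      wilkinson_conversion_time_s(4095, 100e6), "s")
```

The flash conversion completes in a single comparison step regardless of the code, at the cost of a comparator count that doubles with every extra bit, while the Wilkinson conversion time grows with the code itself.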
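The text does not give a formula for the ENOB. A commonly used definition derives it from the measured signal-to-noise-and-distortion ratio (SINAD) of a full-scale sine wave: ENOB = (SINAD − 1.76 dB) / 6.02 dB. The sketch below applies that definition; the function name and the 74 dB figure are illustrative assumptions.

```python
def enob(sinad_db: float) -> float:
    """Effective number of bits from the SINAD (in dB) of a full-scale sine wave."""
    return (sinad_db - 1.76) / 6.02


# A converter with a nominal 14-bit resolution but a measured SINAD of 74 dB
# behaves like an ideal converter of about 12 bits.
print(f"ENOB = {enob(74.0):.2f} bits")
```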
4.2.5. Coincidence and anti-coincidence units
These circuits are used to determine whether an event has been detected in several detectors at the same time (coincidence), or to select events that occur in only one detector (anti-coincidence). This is especially useful in detector arrays in order to discard fake events. The implementation is based on simple logical operations between the signals from the discriminators [6, 7].
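The logical operations involved can be sketched as follows, assuming the discriminator outputs have already been sampled into lists of booleans, one entry per time bin. The function names and the sample data are hypothetical.

```python
def coincidence(a, b):
    """True in the time bins where both discriminators fired (logical AND)."""
    return [x and y for x, y in zip(a, b)]


def anticoincidence(a, veto):
    """True in the time bins where `a` fired and the veto detector did not."""
    return [x and not v for x, v in zip(a, veto)]


det_a = [False, True, True, False, True]
det_b = [False, True, False, False, True]

both = coincidence(det_a, det_b)
only_a = anticoincidence(det_a, det_b)
```

In a real unit the width of the discriminator output pulses sets the coincidence resolving time; here that role is played by the width of one time bin.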
4.2.6. TDC (Time-to-digital converter)
In most applications in nuclear and particle physics, the measurement of time intervals is a primary task. Time intervals are measured between a start and a stop signal, usually provided by discriminator circuits, and a value proportional to the interval between the two signals is then digitized. Different architectures lead to different performance, but the most important figure of merit for a TDC is the time resolution, defined as the minimum time interval the TDC is able to measure. Among the different architectures we can mention the TAC (Time-to-amplitude converter), the direct time-to-digital converter and, for higher resolutions, the differential TDC and the Vernier counter.
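The difference in resolution between a direct counter and a Vernier scheme can be illustrated with two idealised models. Times are integers in picoseconds; the clock period of 1000 ps and the oscillator periods of 1000 ps and 990 ps are assumptions for the example, and the function names are hypothetical.

```python
def counter_tdc(t_start, t_stop, clock_period):
    """Direct TDC: count the clock edges between start and stop.

    The resolution equals one clock period.
    """
    counts = t_stop // clock_period - t_start // clock_period
    return counts * clock_period


def vernier_tdc(t_start, t_stop, t_slow, t_fast, max_cycles=100000):
    """Vernier TDC: start launches a slow oscillator, stop launches a fast one.

    Each cycle the fast oscillator gains (t_slow - t_fast) on the slow one.
    The number of cycles until it catches up measures the interval with a
    resolution of t_slow - t_fast.
    """
    step = t_slow - t_fast
    for n in range(max_cycles):
        if t_stop + n * t_fast <= t_start + n * t_slow:
            return n * step
    return None


true_interval = 8700 - 1300                       # 7400 ps
coarse = counter_tdc(1300, 8700, clock_period=1000)
fine = vernier_tdc(1300, 8700, t_slow=1000, t_fast=990)
```

With these numbers the counter reports 7000 ps while the Vernier scheme resolves the full 7400 ps, at the cost of a conversion time of several hundred oscillator cycles.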
5. Data Bus systems for back-end electronics
In this section we present the most popular standards used today to build DAQ systems in particle physics experiments. All the systems presented are modular systems with each module carrying out a specific function. This technique allows the reuse of the modules in other systems and makes the DAQ system scalable.
Most of the features of these modular systems, such as mechanics, data bus characteristics or data protocols, are defined in standards. Many DAQ system manufacturers develop their own products according to these standards. The use of standards brings many benefits, such as the availability of third-party products and support.
5.1. NIM Standard
NIM stands for Nuclear Instrumentation Module and was established in 1964. The NIM standard does not include any kind of bus for data transfer, since NIM crates only provide power to the NIM modules.
The advantage of the NIM standard is that modules are interchangeable and work as standalone units, which allows DAQ systems to be set up in a simple way and a module to be replaced without affecting the integrity of the rest of the system. These advantages have made the NIM standard very popular in nuclear and particle physics experiments, and it is still used for small experiments. Its main disadvantage is the lack of a digital bus, which prevents computer-based control and data communication between modules.
5.1.1. Crate and modules
Standard NIM crates have 12 slots for modules and include a power supply that provides AC and DC power to the modules. The power supply is distributed via the backplane to the NIM modules that comprise many different functions like discriminators, counters, coincidences, amplifiers or level converters, for example.
Figures 9a and 9b show a standard NIM crate where we can see the NIM connector in the bottom, and a standard NIM module.
5.2. VME standard
The Versa Module Europa (VME) is a standard introduced by Mostek, Motorola, Philips and Thomson in 1981. It offers a backplane that provides fast data transfer, allowing an increase in the amount of transferred data and, therefore, in the channel count coming from the front-end electronics. This makes VME the most widely used standard in physics experiments.
5.2.1. Crate and modules
VME crates contain a maximum of 21 slots where the first position is reserved for a controller module; the other 20 slots are available for modules that can perform other functions.
There are different types of VME modules, each having a different size and a different number of 96-pin connectors that define the number of bits designed for address and data buses.
VME systems use a parallel, asynchronous bus (VMEbus) with a single arbiter and multiple masters. VMEbus also implements a handshaking protocol with multiprocessing and interrupt capabilities. VMEbus is composed of four different sub-buses: Data Transfer Bus, Arbitration Bus, Priority Interrupt Bus and Utility Bus.
While VMEbus achieves a maximum data transfer rate of 40 MBps, some extensions of the VME standard, such as VME64, VME64x and VME320, have enhanced its capabilities by increasing the number of address and data bits and by implementing specific protocols for data communication. In this way, VME64 systems achieve data transfers of up to 80 MBps, VME64x supports data transfers of up to 160 MBps, and VME320 reaches between 320 MBps and 500 MBps.
VXS is an ANSI/VITA standard approved in 2006. It maintains backward compatibility with VME systems by combining the parallel VMEbus with switched serial fabrics. VXS systems achieve a maximum data transfer rate between modules of 3050 MBps.
5.3. PCI standard
PCI stands for Peripheral Component Interconnect and was introduced by Intel Corporation in 1991. The PCI bus is the most popular method used today for connecting peripheral boards to a PC, providing a high-performance 32-bit or 64-bit bus with multiplexed address and data lines. PC-based DAQ systems can be easily built using PCI, as PCI cards are connected directly to a PC.
5.3.1. PCI cards
The latest PCI standard specifies three basic form factors for PCI cards: long, short and low profile. PCI cards are keyed to distinguish between 5 V and 3.3 V signaling, and they use connectors with different pin counts according to the data and address bus widths.
5.3.2. PCI Local Bus
PCI devices are connected to the PC via a parallel bus called the PCI Local Bus. Typical PCI Local Bus implementations support up to four PCI boards that share the address bus, the data bus and most of the protocol lines, but have dedicated lines for arbitration. The bus width and clock speed of the PCI Local Bus determine the maximum data transfer speed. Table 1 shows a summary of the achievable data transfer speeds in PCI and in its extended version, PCI-X.
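The dependence of the peak transfer speed on bus width and clock speed is simple arithmetic, sketched below. The width and clock combinations are common nominal values used here as illustrative assumptions, and the function name is hypothetical. The result is a theoretical upper bound that assumes one transfer per clock cycle with no arbitration or handshaking overhead.

```python
def parallel_bus_peak_mbytes_per_s(width_bits, clock_mhz):
    """Theoretical peak rate of a parallel bus: bytes per transfer times clock rate."""
    return width_bits / 8 * clock_mhz


peak_32_33 = parallel_bus_peak_mbytes_per_s(32, 33)   # 32-bit bus at a nominal 33 MHz
peak_64_66 = parallel_bus_peak_mbytes_per_s(64, 66)   # 64-bit bus at a nominal 66 MHz
```

Doubling either the width or the clock doubles the peak rate, so the 64-bit, 66 MHz case is four times faster than the 32-bit, 33 MHz case.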
A disadvantage of the PCI standard is the use of a parallel bus for data and address lines. The skew between lines, the fact that only one master/slave pair can communicate at any time, and the handshaking protocol all limit the maximum achievable data transfer rate in PCI.
Further, in 1995, PICMG introduced the CompactPCI (cPCI) standard as a very high performance bus based on the PCI bus and using Eurocard format boards. However, cPCI is not widely used in particle physics experiments due to some additional disadvantages, such as the small card size, the limited power consumption and the limited number of slots.
| Standard version | Bus width | Clock speed (MHz) | Data rate (Mbps) |
5.4. PCI Express
PCIe stands for Peripheral Component Interconnect Express. The PCIe standard was introduced in 2002 to overcome the space and speed limitations of the conventional PCI bus by increasing the bandwidth while decreasing the pin count. This standard defines not only the electrical characteristics of a point-to-point serial link, but also a protocol for the physical layer, data link layer and transaction layer. Moreover, the PCIe standard includes advanced features such as active power management, quality of service, hot plug and hot swap support, data integrity, error handling and true isochronous capabilities [26, 27]. Like PCI systems, PCIe systems allow the implementation of PC-based DAQ systems.
5.4.1. PCIe cards
PCIe uses four different connector versions: x1, x4, x8 and x16, where the number refers to the number of available bidirectional data paths (lanes); they correspond to 36-, 64-, 98- and 164-pin connectors, respectively. There are two possible form factors for PCIe cards: long and short.
5.4.2. PCIe Bus
The PCIe serial bus transmits at a data rate of 2.5 Gbps per lane using the LVDS logic standard. However, the effective data rate is reduced to 80% of this line rate by the 8b/10b encoding. A summary of the data rates achieved per direction using PCIe is shown in table 2:
| Bus | Data rate (MBps) |
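The effect of the 8b/10b encoding on the usable data rate can be worked out with a few lines of arithmetic. This is a sketch under the assumption of the 2.5 Gbps line rate quoted above and one direction of transfer; the function name is hypothetical, and protocol overhead from the data link and transaction layers is ignored.

```python
def pcie_payload_mbytes_per_s(lanes, line_rate_gbps=2.5):
    """Payload rate per direction for an 8b/10b-encoded link with the given lane count."""
    payload_gbps_per_lane = line_rate_gbps * 8 / 10   # 8 data bits per 10 line bits
    return payload_gbps_per_lane * 1000 / 8 * lanes   # Gbit/s -> MByte/s, times lanes


rates = {f"x{n}": pcie_payload_mbytes_per_s(n) for n in (1, 4, 8, 16)}
```

Each lane therefore carries 250 MBps of payload per direction, and the rate scales linearly with the number of lanes of the connector.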
5.5. ATCA standard
ATCA stands for Advanced Telecommunications Computing Architecture and was introduced by PICMG in 2002 in the PICMG 3.0 specification. The PICMG 3.0 and 3.x specifications define a modular open architecture including mechanical features, components, power distribution, backplane and communication protocols. This specification was created for telecommunication purposes, where high speed, high availability and reliability are essential. ATCA systems can deliver a service availability of 99.999% of the time.
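To give a feeling for what an availability of 99.999% means in practice, the figure can be converted into permitted downtime per year. This is a simple arithmetic sketch assuming a 365-day year; the function name is hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(availability):
    """Maximum yearly downtime compatible with a given availability (0 to 1)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


five_nines = downtime_minutes_per_year(0.99999)
```

The result is a little over five minutes of downtime per year.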
5.5.1. ATCA shelf
The shelf can house a varying number of ATCA boards (blades), and it also houses the shelf manager, which is responsible for power and thermal control. Figure 10a shows a 14-slot ATCA shelf with a height of 13U, and figure 10b shows a processor ATCA module.
5.5.2. ATCA modules
Regarding the ATCA modules, we can highlight three main types of ATCA blades for data transport purposes: Front boards, Rear Transition Modules (RTM) and Advanced Mezzanine Cards (AMC). All of these modules are hot swappable and have different form factors.
Front boards are connected to the shelf backplane through the Zone 1 and Zone 2 connectors. The Zone 1 connector is used to power the module and the Zone 2 connector is used for data transport signals. Moreover, Front boards have a third connector, Zone 3, which provides a direct connection to the RTM.
Rear Transition Modules are placed at the rear side of the shelf and are used to expand the functionality of the ATCA system.
AMCs are mezzanine modules that plug onto ATCA carriers to extend the system functionality. Examples of AMCs include CPUs, DSP systems or storage.
Moreover, the ATCA Fabric Interface reaches data rates between modules of 40 Gbps using protocols such as Gigabit Ethernet, Infiniband, Serial Rapid IO or PCIe and network topologies such as dual star, dual-dual star or full mesh. These features give ATCA systems a clear advantage over other platforms.
5.6. MicroTCA standard
MicroTCA is a complementary specification to PICMG 3.0, introduced by PICMG in 2006. It was defined for systems that require lower performance, availability and reliability than ATCA systems, as well as less space and lower cost, while maintaining many features of PICMG 3.0 such as shelf management and fabric interconnects.
5.6.1. MicroTCA shelf and modules
The shelf can house and manage up to 12 single or double size AMCs. AMCs are plugged directly into the backplane, in a similar way to ATCA carriers. The function of the backplane is to provide power to the AMC boards and to connect them to the data, control, system management, synchronization clock and JTAG test lines. The MicroTCA backplane can implement network topologies such as star, dual star, mesh or point-to-point between AMCs. The protocols used for data communication in the MicroTCA backplane are Ethernet 1000BASE-BX, SATA/SAS, PCIe, Serial Rapid IO and 10GBase-BX4. Data transfers between AMCs within the MicroTCA backplane can achieve speeds of 40 Gbps.
5.7. Transmission media
In the past, copper wires, such as coaxial or twisted-pair cables, were widely used to connect front-end and back-end electronics, or even modules within back-end systems. For example, many NIM and VME modules use coaxial cables with BNC, LEMO or SMA connectors for control or data communication.
Nowadays, however, data transmission for DAQ has moved to fiber optics, whose advantages over copper cables make them the best option for transmitting data in present particle physics experiments. These advantages include EMI immunity, lower attenuation, the absence of electrical discharges, short circuits, ground loops and crosstalk, resistance to nuclear radiation and high temperatures, lower weight and higher bandwidth.
Due to the widespread use of fiber optics, optical modules play an essential role in present particle physics experiments. Optical modules are needed to convert electrical signals into optical ones for transmission over optical fibers. Some examples of optical modules used for data transmission in particle physics experiments, together with their data bandwidths, are shown in table 3.
| Type | Channel count/fiber ribbon | Bandwidth (max) |
6. Back-end data processing
In recent decades, improvements in analog-to-digital converters, in terms of sampling rate and resolution, have opened a wide range of possibilities for digital data processing. The migration from analog to digital processing has revealed a number of areas where the digital approach has potential advantages, such as system complexity, parameter setup changes or scalability. On the other hand, system designers have to deal with larger amounts of data processed at higher sampling rates, which affects both the complexity of the processing algorithms working in real time and the transport of those data at high rates.
For instance, digital processing has demonstrated significant advantages processing pulses from large-volume germanium detectors, where a good choice in the pulse shaping parameters is crucial for achieving good energy resolution and minimum pulse pile-up for high counting rates.
6.1. Commonly used algorithms
Following the heritage of analog data processing, some digital data processing algorithms perform tasks similar to those of the analog blocks, taking advantage of the digital information provided by the ADCs. These algorithms can be divided into five groups:
Shaping or filtering: When only part of the information from the detector pulses is relevant, such as the height of the pulse, shaping techniques can be applied. They filter the digital data according to certain shaping parameters, which can be changed more easily in a digital setup. The only difference between applying the same filter in the analog and in the digital approach therefore lies in their continuous or discrete nature. In addition to time-invariant filters similar to the analog ones, adaptive filtering can also be applied in the digital domain, changing the filter characteristics for a certain period of time.
Pulse shape analysis: Exploiting the amount of digital information made available by fast digitization, different techniques for analyzing the shape of the pulses can be applied. Depending on the detector response, these algorithms can be used to obtain better detector performance or to distinguish between different input particles, as shown in Figure 11.
Baseline restoration: During the time gap between two consecutive pulses, the baseline value can also be digitized and easily subtracted from the digital values of the waveform. Sometimes more elaborate algorithms are applied to calculate the baseline of the pulses. In this way better system performance is achieved, avoiding changes due to temperature drifts or other external agents.
Pile-up deconvolution: The pile-up effect consists of the accumulation of pulses from different events within a short time, which, in principle, prevents the study of those events. In analog electronics, this effect usually increases the dead time of the system, as those events have to be rejected. However, taking advantage of the digital data, a further analysis of these pulses can be performed and, in some cases, the information of the compound pulses can be disentangled.
Timing measurements: Timing information is mainly managed in two ways:
Trigger generation: In a fully digital system, pulse information can be used to generate logic signals that validate certain events of interest. Furthermore, in complex systems with different trigger levels, the generation of logic signals for the validation or rejection of events becomes very important. For this purpose, two methods are commonly used: leading edge triggering and constant fraction timing. The first is the simplest and generates the logic pulse by comparing the input pulse with a constant trigger level. In the second, a small algorithm generates the logic pulse at a constant percentage of the pulse height. There are other algorithms, such as crossover timing, ARC timing or ELET, but they are less commonly used.
Measurement of timing properties: With the trigger information generated either in analog or in digital form, several logic setups can be built. Thus, depending on the complexity of the experiment and its own characteristics, trigger pulses can be used to measure absolute timing between detectors, to build new trigger levels according to certain conditions, or to filter events according to a specific coincidence trigger.
Although the needs of the experiments and the complexity of the setups vary enormously, the processing algorithms can usually be placed in one of the categories described above. However, it is also common to combine several algorithms in the processing chain, so the system architecture can be quite elaborate. Splitting the system into several firmware and software blocks allows designers and programmers to manage the complexity of the experiment.
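A combination of several of these algorithms can be sketched as a small processing chain: baseline restoration, a shaping filter and a leading edge trigger applied in sequence to one digitized waveform. The waveform, the filter width and the threshold are invented for illustration, the moving average stands in for the more elaborate shapers used in practice, and all function names are hypothetical.

```python
def subtract_baseline(samples, n_pre):
    """Baseline restoration: subtract the mean of the first n_pre (pulse-free) samples."""
    baseline = sum(samples[:n_pre]) / n_pre
    return [s - baseline for s in samples]


def moving_average(samples, width):
    """Simple shaping filter; the first width - 1 outputs are a start-up transient."""
    out, acc = [], 0.0
    for i, s in enumerate(samples):
        acc += s
        if i >= width:
            acc -= samples[i - width]   # drop the sample leaving the window
        out.append(acc / width)
    return out


def leading_edge_trigger(samples, threshold):
    """Index of the first sample at which the signal rises through the threshold."""
    for i in range(1, len(samples)):
        if samples[i - 1] < threshold <= samples[i]:
            return i
    return None


# Hypothetical digitized waveform: ADC pedestal of 100 counts plus one pulse.
RAW = [100] * 10 + [100, 120, 160, 180, 170, 150, 130, 115, 105, 100] + [100] * 5

restored = subtract_baseline(RAW, n_pre=8)
shaped = moving_average(restored, width=2)
trigger_index = leading_edge_trigger(shaped, threshold=30.0)
pulse_height = max(shaped)
```

Each stage is an independent block, which mirrors the split into firmware and software blocks mentioned above: the first two stages would typically run in an FPGA at the sampling rate, while the extracted trigger index and pulse height could be handled downstream.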
6.2. Hardware choices
The complexity and performance of the algorithms presented above vary depending on the input data, the sampling rate and the implementation architecture. For the latter, several options are available according to the needs of the experiment:
Digital Signal Processor (DSP): DSPs are Integrated Circuits (ICs) that perform programmable filtering algorithms by applying the Multiply-Accumulate (MAC) operation. These devices have been in use for more than 30 years, especially in other fields such as audio, image or biomedical signal processing. The wide knowledge acquired with these devices has contributed to their use within the particle physics community, which takes advantage of their compactness and their stability under temperature changes.
Field-Programmable Gate Array (FPGA): These programmable ICs are composed of a large number of gates that can be individually programmed and linked, as depicted in figure 12. Like Application-Specific Integrated Circuits (ASICs), they are usually programmed using Hardware Description Languages (HDLs), so, in theory, they could implement any of the algorithms presented above. Furthermore, these devices usually embed DSP blocks or microprocessors, which allow different processing tasks to be performed, even concurrently.
Central Processing Unit (CPU): In some cases, a Personal Computer (PC) with specific memory or CPU characteristics is used. In this case, the data are processed by software programs running on top of the operating system. PCs are often included as an interface to long-term memories, i.e. to handle the storage process. However, sometimes additional data processing is required, and they are particularly efficient when there is a large amount of data that does not need to be processed at a high frequency. Moreover, data processing at this point sometimes requires a lot of computing resources, so these processes usually run in computer farms, which are handled by distributed applications or operating systems.
Graphics Processing Unit (GPU): When the degree of parallel processing and the amount of data to manage increase, CPUs may not be the best hardware architecture to support them. The underlying idea of GPUs is to use a graphics card as a processing unit. They have been shown to be more efficient than CPUs in some configurations.
Grid computing: This technique, which is used in large experiments where the amount of data cannot be handled even by computing farms, consists of taking advantage of the Internet to connect computing farms or PCs in order to process large volumes of data with an enormous number of heterogeneous processing units. This technique is applied to data without timing constraints, as the processing time of each unit may be different.
This short description has given an overview of the main hardware blocks available for data processing. It is important to remark, however, that the final system requirements may favor one particular setup. The reader is encouraged to review the bibliography for further details.
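The Multiply-Accumulate operation mentioned for DSPs is the core of a finite impulse response (FIR) filter, which can be sketched in a few lines. This is a plain software illustration of the arithmetic, not a model of any particular device; the coefficients and the function name are assumptions for the example.

```python
def fir_filter(samples, coefficients):
    """Direct-form FIR filter: each output is a sum of multiply-accumulate steps."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, c in enumerate(coefficients):
            if n - k >= 0:
                acc += c * samples[n - k]   # one MAC: multiply, then accumulate
        out.append(acc)
    return out


smoothed = fir_filter([1.0, 2.0, 3.0, 4.0], [0.5, 0.5])   # two-tap averaging filter
```

A DSP executes the inner loop sequentially with a dedicated MAC unit, whereas an FPGA can instantiate one multiplier per coefficient and compute all the products of one output sample in parallel.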
7. Application examples
This section briefly reviews some implementations of DAQ systems in particle physics experiments and medical applications, in order to clarify the concepts of detectors, algorithms and hardware units described previously.
The first example concerns the data processing in the Advanced GAmma Tracking Array (AGATA). As it is a “triggerless” system, the detector signals (HPGe crystals) are continuously digitized and sent to the pre-processing electronics, which implement shaping, baseline restoration, pile-up deconvolution and trigger generation algorithms in FPGAs in a fully digital way. After that, crossing different bus domains, the data arrive at a PC that performs Pulse Shape Analysis algorithms to calculate the position of the interaction and adds it to the energy information calculated in the pre-processing. Then, data coming from all detectors arrive at a PC (Event Builder), using a distributed Digital Acquisition program. When the event is generated, its information is added to the data from other detectors (ancillaries) in a PC called “Merger”, which sends them to the tracking processor. This is another PC that performs tracking algorithms for reconstructing the path of the gamma rays in the detectors. Finally, the data are stored in external servers.
In this example, most of the algorithms and hardware configurations presented are used. In addition, GPUs have been tested for the Pulse Shape Analysis and have shown excellent performance. Grid computing techniques are also used for data analysis and storage, so nearly all the hardware components presented previously are covered in this example.
The second example is the data processing in the ATLAS detector at CERN. In this case, the system is composed of several different types of sensors and the DAQ system is controlled by a three-level trigger system. In the first trigger level, the events of interest recorded in the detectors, mostly selected by comparators, are sent directly to the second trigger level. At this level, the information from several sub-detectors is correlated and merged according to the experimental conditions. Finally, a third event filtering stage is carried out with the data from the whole system. Along these trigger levels, different processing algorithms are used, combined in different hardware setups. However, all of them can be included in one of the categories detailed previously. An example of the hardware systems developed for this particle physics experiment can be found in [35, 36].
The last example comes from radiation therapy, where ionizing radiation is used for medical purposes. Nowadays, radiologists make use of radiation beams (gamma rays, neutrons, carbon ions, electrons, etc.) to treat cancer, but they also take advantage of the properties of ionizing radiation for the diagnosis of internal diseases through medical imaging. This field has driven the development of devices capable of, on the one hand, producing the particles needed for a specific treatment and, on the other hand, detecting the radiation beam (in some cases, the part of the radiation that has not been absorbed by the patient) to reconstruct the internal image. This is the case of the Computed Tomography (CT) scan, a medical imaging technique consisting of an X-ray source (X-ray tube) that rotates 360º around the patient, providing at each rotation a 2D cross-sectional image, or even a 3D image by putting all the scans together through computing techniques. The X-rays are detected either directly or indirectly, depending on the device; the detection area comprises from one up to 2600 detectors of two categories, scintillators (coupled to PMTs) or gas detectors. Another well-known imaging technique is Positron Emission Tomography (PET). It provides a picture of the metabolic activity of the body thanks to the detection of the two gamma rays that are emitted after the annihilation of a positron produced by a radionuclide previously introduced into the patient. The gamma rays are detected with scintillators, generally coupled to PMTs but also to Si APDs.
In this chapter we have presented a review of the technologies currently used in particle physics experiments, following the natural path of the signals from the detector to the data processing. Even though these kinds of applications are well established, there is no comprehensive review of the overall technologies commonly in use, which this chapter attempts to provide in a very light version. As the field is wide, we have tried to be concise and to provide the interested reader with a list of references to consult.
This work is funded by the Spanish Commission of Science and Technology under project reference FPA2009-13234-C04-02.