An Optimization of 16-Point Discrete Cosine Transform Implemented into a FPGA as a Design for a Spectral First Level Surface Detector Trigger in Extensive Air Shower Experiments

In this book the reader will find a collection of chapters authored or co-authored by a large number of experts around the world, covering the broad field of digital signal processing. The book is intended to highlight current research in digital signal processing and to show recent advances in the field. It is aimed mainly at researchers in digital signal processing and related areas, but it is also accessible to anyone with a scientific background who wants an up-to-date overview of the domain. Each chapter is self-contained and can be read independently of the others. These nineteen chapters present methodological advances and recent applications of digital signal processing in various domains such as communications, filtering, medicine, astronomy


Introduction
The Pierre Auger Observatory is a ground-based detector located in Malargüe, Argentina (Auger South), at 1400 m above sea level, dedicated to the detection of ultra-high-energy cosmic rays with energies above $10^{18}$ eV with unprecedented statistical and systematic accuracy. The main goal of cosmic-ray investigation in this energy range is to determine the origin and nature of the particles produced at these enormous energies, as well as their energy spectrum. These cosmic particles carry information complementary to that carried by neutrinos, photons and even gravitational waves. They also provide an extremely energetic stream for the study of particle interactions at energies orders of magnitude above those reached at terrestrial accelerators (Abraham J. et al., 2004). The flux of cosmic rays above $10^{19}$ eV is extraordinarily low: of the order of one event per square kilometer per century. Only detectors of exceptional size, thousands of square kilometers, can acquire a significant number of events. The nature of the primary particles must be inferred from the properties of the associated extensive air showers (EAS). The Pierre Auger Observatory consists of a surface detector (SD) array spread over 3000 km$^2$, which measures the charged particles of the EAS and the lateral density profile of the muon and electromagnetic components in the shower front at ground level, and of 24 wide-angle Schmidt telescopes installed at 4 locations on the boundary of the ground array, which measure the fluorescence light associated with the evolution of air showers: their growth and subsequent decay during development. Such "hybrid" measurements allow cross-calibration between different experimental techniques, controlling and reducing the systematic uncertainties. Very inclined showers differ from ordinary vertical ones. At large zenith angles the slant atmospheric depth to ground level is enough to absorb the part of the shower that follows from the standard cascading interactions, both electromagnetic and hadronic. Only penetrating particles such as muons and neutrinos can traverse the atmosphere at large zenith angles to reach the ground or to induce secondary showers deep in the atmosphere and close to an air shower detector.
The ability to analyze inclined showers with zenith angles larger than 60° induced by neutrinos or photons substantially increases the acceptance of the surface array and opens a part of the sky that was previously inaccessible to the detector. These showers provide a new tool for the interpretation of ultra-high-energy cosmic rays because they probe muons of significantly higher energies than vertical showers do. Spectral triggers, which offer pattern recognition in the frequency domain, may improve on the standard detection technique based on coincidences of signals from several PMT channels above given thresholds in the time domain. "Old" muon shower fronts have only a small longitudinal extension, which leads to detector signals that are also short in time. To identify these showers in the presence of "young" showers with a large electromagnetic component, the trigger may need very good spectral sensitivity to the fast muon component. The main advantage of the spectral trigger is its scaling feature: the set of DCT coefficients, once normalized to the 1st harmonic, depends only on the shape of the signals, not on their amplitudes. Triggers sensitive to the shape of the FADC traces may detect events with the expected characteristics, i.e. the quickly attenuated, very short peaks related to the flat muonic fronts coming from very inclined showers. Independence of the amplitude is especially promising for Auger North, where the coincidence technique cannot be used because each surface detector has a single PMT. In order to keep a reasonable rate for the 1st level trigger (ca. 100 Hz), the threshold for the 1st level trigger would have to be much higher than, for example, in the Pierre Auger Observatory, where 3-fold coincidences suppress the noise.

Triggers
Two different triggers are currently implemented at the 1st level. The first is a single-bin trigger generated as a 3-fold coincidence of the 3 PMTs at a threshold equivalent to 1.75 Vertical Equivalent Muons. The estimated current for a Vertical Equivalent Muon ($I_{VEM}$) is the reference unit for the calibration of FADC trace signals and corresponds to ca. 50 ADC-counts. This trigger has a rate of about 100 Hz. It is used mainly to detect fast signals, which also correspond to the muonic component generated by horizontal showers. The single-bin trigger is generated when the input signal is above fixed thresholds calculated in the micro-controller during the calibration process. It is the simplest trigger, useful for high-level signals. The second trigger is the Time over Threshold (ToT) trigger, which requires at least 13 time bins above a threshold of 0.2 $I_{VEM}$. A pre-trigger ("fired" time bin) is generated if a coincidence of any two channels appears, and the fired bins are counted in a sliding time window of 120 × 25 ns length. This trigger has a relatively low rate of about 1.6 Hz, which is the expected rate for two muons crossing the Auger surface detector. It is designed mainly for selecting small signals spread in time, typical of distant high-energy EAS or of low-energy showers, while ignoring the single-muon background (Abraham J. et al., 2010).

Cherenkov light generated by very inclined showers crossing the Auger surface detector can reach the PMT directly, without reflections on the Tyvek liners. Especially for "old" showers the muonic front is very flat. Together these give a very short direct light pulse falling on the PMT and consequently a very short rise time of the PMT response. For vertical or weakly inclined showers, where the geometry does not allow the Cherenkov light to reach the PMT directly, the light pulse is collected after many reflections on the tank walls. Additionally, showers that have developed over a smaller slant depth are relatively thick. As a result the PMT signal is spread in time and rises relatively slowly. Hadron-induced showers with a dominant muon component give an early peak with a typical rise time of mostly 1 to 2 time bins (at 40 MHz sampling) and a decay time of the order of 80 ns (Aglietta et al., 2005). An estimate of the rise time of the front based on one or two time bins is rather rough. A rise time estimated as two time bins may be overestimated because of the low sampling rate and the quantization error in time. A higher time resolution would be favorable. The expected shape of the FADC traces suggests using a spectral trigger instead of a pure threshold analysis in order to recognize the shape characteristic of traces from very inclined showers. Monitoring the shape would include analysis of both the rising edge and the exponentially attenuated tail. A very short rise time together with a relatively quickly attenuated tail could be a signature of very inclined showers. We observe numerous very inclined showers that cross the full array but "fire" only a few surface detectors (Fig. 1). For such showers many more detectors should have been hit. The muonic front probably produces PMT signals that are not high enough to generate 3-fold coincidences; some of the signals are below the thresholds (see Fig. 2). This may be the reason for the "gaps" in the array of activated surface detectors.
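The two time-domain conditions described above can be sketched in a few lines of Python. This is only an illustration, not the firmware logic: the function names, the list-based pedestal-subtracted traces and the way the sliding window is counted are assumptions, while the thresholds (1.75 and 0.2 $I_{VEM}$ with $I_{VEM}$ of ca. 50 ADC-counts), the 3-fold and 2-fold coincidences, the 13 fired bins and the 120-bin window are taken from the text.

```python
I_VEM = 50  # ADC-counts per Vertical Equivalent Muon (approximate value from the text)


def single_bin_trigger(traces, threshold=1.75 * I_VEM):
    """True if all three PMT traces exceed the threshold in the same time bin."""
    return any(all(sample > threshold for sample in bins) for bins in zip(*traces))


def tot_trigger(traces, threshold=0.2 * I_VEM, min_bins=13, window=120):
    """Time over Threshold: at least `min_bins` fired bins inside a sliding window.

    A bin counts as "fired" when at least two channels are above the threshold.
    """
    fired = [sum(s > threshold for s in bins) >= 2 for bins in zip(*traces)]
    count = 0
    for i, is_fired in enumerate(fired):
        count += is_fired
        if i >= window:                 # drop the bin that just left the window
            count -= fired[i - window]
        if count >= min_bins:
            return True
    return False
```

A single large sample seen by all three PMTs satisfies only the first condition, whereas a low signal lasting 20 bins in two PMTs satisfies only the second, which is the complementarity the text describes.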

Discrete Fourier Transform vs. Discrete Cosine Transform
There are several variants of the DCT with slightly modified definitions. The DCT-I is exactly equivalent (up to an overall scale factor of 2) to a DFT of 2N−2 real numbers with even symmetry. The most commonly used form of the Discrete Cosine Transform is the DCT-II,

$$X_k = \alpha_k \sum_{n=0}^{N-1} x_n \cos\left[\frac{\pi}{N}\left(n + \frac{1}{2}\right) k\right] \qquad (1)$$

where $\alpha_k$ is a normalization factor that takes one value for k = 0 and another for k ≥ 1. The DCT-III form is sometimes simply referred to as "the inverse DCT" (IDCT). A variant of the DCT-IV in which data from different transforms are overlapped is called the Modified Discrete Cosine Transform (MDCT). The DCT is a Fourier-related transform similar to the DFT, but using only real numbers. DCTs are equivalent to DFTs of roughly twice the length operating on real data with even symmetry (since the Fourier transform of a real and even function is real and even), where in some variants the input and/or output data are shifted by half a sample. The DCT-II and DCT-IV are considered an alternative approach to the FFT. In fact, the FFT routine can be fed in an interleaved mode, with even samples treated as real data and odd samples as imaginary data. A trigger based on the Discrete Fourier Transform (DFT) (Radix-2 FFT) (Szadkowski, 2006) has already been implemented in the 3rd generation of the Front-End Board (FEB) based on the Altera® Cyclone™ chip (Szadkowski, 2005b). However, for a real signal $x_n$ the N/2-th spectral line of $X_k$, k = 0,1,...,N−1, lies on a symmetry axis: the real part is symmetric and the imaginary part is antisymmetric about it. The useful information is contained only in the first N/2 + 1 spectral lines, k = 0,1,...,N/2, corresponding to frequencies $f_k = k f_s / N$, i.e. from zero up to $f_s/2$ for a sampling frequency $f_s$.

Fig. 2. … detectors are below the standard thresholds and they are detected by chance (compare the registration efficiency for a similar event shown in Fig. 1). For all very inclined showers the rising edge corresponds to one or two time bins.
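The statement that a DCT is equivalent to a DFT of roughly twice the length on even-symmetric real data can be checked numerically. The sketch below is illustrative only: it assumes the unnormalized DCT-II (the scale factor is omitted), and the function names and the sample values are hypothetical. The input is mirrored to a sequence of length 2N, a plain DFT is taken, and the half-sample phase shift is removed.

```python
import cmath
import math


def dct2(x):
    """Unnormalized DCT-II: C_k = sum_n x_n * cos(pi * (2n + 1) * k / (2N))."""
    n_pts = len(x)
    return [sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * n_pts))
                for n in range(n_pts))
            for k in range(n_pts)]


def dct2_via_dft(x):
    """The same coefficients obtained from a 2N-point DFT of the mirrored sequence."""
    n_pts = len(x)
    y = list(x) + list(reversed(x))      # even-symmetric extension, length 2N
    out = []
    for k in range(n_pts):
        y_k = sum(y[n] * cmath.exp(-2j * math.pi * n * k / (2 * n_pts))
                  for n in range(2 * n_pts))
        # Y_k = 2 * exp(+i*pi*k/(2N)) * C_k, so undo the half-sample phase shift
        out.append((cmath.exp(-1j * math.pi * k / (2 * n_pts)) * y_k).real / 2)
    return out
```

Both routines agree to rounding error for any real input, which is why a 16-point DCT is later compared with a 32-point FFT.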

Pedestal independence
The analog section of the FEB has been designed to have a pedestal of ca. 10 % of the full FADC range in order to investigate undershoots. However, the pedestal is relatively sensitive to temperature: its daily variation may reach 5 ADC-counts. A pedestal-independent trigger is therefore very welcome. Let us consider a signal with a constant pedestal added to every sample. Due to the symmetry and parity of the cosine, the pedestal term can be evaluated separately for odd and even indices (5); repeating (5) recursively, we finally find that the pedestal term vanishes. Consequently, for k > 0 the DCT coefficients are independent of the pedestal.
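The pedestal independence can be illustrated numerically. In the sketch below the trace values and the pedestal of 40 ADC-counts are hypothetical, and the unnormalized DCT-II is assumed. Adding a constant to every sample should change only the k = 0 coefficient, by N times the pedestal, because the cosine sum over n vanishes for every k > 0.

```python
import math


def dct2(x):
    """Unnormalized DCT-II: C_k = sum_n x_n * cos(pi * (2n + 1) * k / (2N))."""
    n_pts = len(x)
    return [sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * n_pts))
                for n in range(n_pts))
            for k in range(n_pts)]


# Hypothetical 16-sample trace: two bins on the baseline, a sharp edge, a decaying tail
signal = [0, 0, 100, 75, 56, 42, 32, 24, 18, 13, 10, 8, 6, 4, 3, 2]
pedestal = 40
shifted = [s + pedestal for s in signal]

# Difference between the coefficients with and without the pedestal
diff = [abs(b - a) for a, b in zip(dct2(signal), dct2(shifted))]
```

Only `diff[0]` is non-zero here; the remaining entries are at the level of floating-point rounding.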

Scaling
The DCT algorithm has a significant advantage over the FFT. The structure of the DCT coefficients is much simpler to interpret and to use in a trigger implementation than the structure of the real and imaginary FFT coefficients (compare the 4th row for the FFT data with the 2nd row for the DCT coefficients in Fig. 3). For exponentially attenuated signals from the PMTs the higher DCT coefficients (scaled to the 1st harmonic) are almost negligible, while both the real and imaginary parts of the FFT (scaled to the modulus of the 1st harmonic) give relatively significant contributions and are not suitable for triggering.
When a peak appears in the purely attenuated signal (last column in Fig. 3), the structure of the DCT changes dramatically and the trigger condition immediately expires, while the moduli of the FFT components hardly change. The structure of the FFT harmonics for the last graph in Fig. 3 would be more suitable for a trigger (almost negligible imaginary part for the higher harmonics and also relatively low real harmonics); however, it corresponds precisely to the situation in which the purely attenuated signal is distorted by a peak on the tail and the trigger condition has been violated.
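The scaling property and the sensitivity to a distortion of the tail can be sketched as follows. The trace (an exponential with attenuation factor 0.35), the factor of 3 and the extra peak of 50 ADC-counts are hypothetical values chosen for illustration, and the unnormalized DCT-II is assumed; only the idea of normalizing the coefficients to the 1st harmonic comes from the text.

```python
import math


def dct2(x):
    """Unnormalized DCT-II: C_k = sum_n x_n * cos(pi * (2n + 1) * k / (2N))."""
    n_pts = len(x)
    return [sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * n_pts))
                for n in range(n_pts))
            for k in range(n_pts)]


def normalized(x):
    """Coefficients xi_k = X_k / X_1 for k >= 2 (xi_1 would be identically 1)."""
    c = dct2(x)
    return [c[k] / c[1] for k in range(2, len(c))]


tail = [100 * math.exp(-0.35 * n) for n in range(16)]  # hypothetical decaying trace
scaled = [3 * s for s in tail]                          # same shape, 3x the amplitude
bumped = list(tail)
bumped[8] += 50                                         # extra peak on the tail

xi_tail = normalized(tail)
xi_scaled = normalized(scaled)
xi_bumped = normalized(bumped)
```

`xi_scaled` coincides with `xi_tail`, while `xi_bumped` differs visibly already at k = 2, so a narrow acceptance range around the reference pattern rejects the distorted trace regardless of its amplitude.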
The plot in the 4th row and 3rd column of Fig. 3 shows the contribution of the DFT coefficients relative to the absolute value of the 1st harmonic. For an exponentially attenuated signal (with attenuation factor β) the contributions of both the real and imaginary coefficients decrease monotonically, with a significant value for all real coefficients. From the DFT definition we obtain expression (8), where φ = 2πk/N. Evaluating (8) for the boundary factors β = (0.28, 0.42) (from the Auger database) and for k = N/2 (the lowest term in the monotonically decreasing chain), we obtain for N = 16: ξ = 24% and 28%, respectively. These values are too large to be used for triggering. Even extending the DFT size does not help much: for N = 32 we still get large values, ξ = 17% and 23%. The almost vanishing higher DCT coefficients provide much more natural trigger conditions. A 32-point FFT (roughly equivalent to a 16-point DCT) does not offer better stability.
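The quoted values of ξ can be cross-checked with a direct numerical DFT. The sketch below rests on two assumptions that are not spelled out in the surviving text: the trace is taken as $x_n = e^{-\beta n}$, and ξ is taken as the real DFT coefficient at k = N/2 divided by the modulus of the 1st harmonic. The function names are hypothetical.

```python
import cmath
import math


def dft(x):
    """Plain DFT: X_k = sum_n x_n * exp(-2*pi*i*n*k/N)."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * n * k / n_pts)
                for n in range(n_pts))
            for k in range(n_pts)]


def xi(beta, n_pts):
    """Ratio of the k = N/2 real coefficient to |X_1| for x_n = exp(-beta * n)."""
    spectrum = dft([math.exp(-beta * n) for n in range(n_pts)])
    return abs(spectrum[n_pts // 2].real) / abs(spectrum[1])


# Percentages for the boundary attenuation factors and both transform sizes
results = {(beta, n): round(100 * xi(beta, n))
           for n in (16, 32) for beta in (0.28, 0.42)}
```

Under these assumptions the rounded percentages agree with the values quoted above for N = 16 and N = 32, which supports this reading of ξ.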

General DCT algorithm
The DCT of a real signal $x_n$ gives independent spectral coefficients for k = 0,1,...,N−1, with $f_k$ likewise starting from zero but placed on a 2N grid. Compared with the DFT, the DCT thus gives twice the frequency resolution. Splitting the sum (1), redefining the indices and using the symmetry of the cosine function, we can introduce the new set of variables $A_n$ (12). The DCT coefficients can then be separated for even and odd indices, respectively (13). Let us notice that (13) for even indices has the same structure as (1), only with a shorter range of indices. Recursively we can introduce new sets of variables for the set of indices k = 2p, where p is an integer, as long as k < N. In order to exploit the symmetry of the trigonometric functions to the full, N should be a power of 2, as in the Radix-2 approach used in the FFT algorithm. If N = $2^q$, recursive minimization is possible up to p = q. The twiddle factors for successive minimization steps m equal $\cos\left(\frac{2\pi}{2^q}\cdot\frac{2^{p+m}}{2}\right) = -1$, because the sum of the step index m and the range factor p is constant and equals q. For the remaining indices the twiddle factor depends on the fractional angle $\alpha = \pi 2^{q-m-1}/N$. After the 1st step of minimization, the terms of the sum (13) for odd indices depend only on odd multiples of the fractional angle. Using the trigonometric identity $2\cos a \cos b = \cos(a+b) + \cos(a-b)$, the fractional angles can be increased by a factor of 2 for $\beta = k\pi/(2N)$. Let us notice that: 1) $\cos(k\pi) = (-1)^k$ for n = N−1, hence a pure $A_n$ coefficient survives; 2) $\cos(k\pi/2) = 0$ for n = N/2, because k is odd; 3) the remaining indices appear in cosine terms twice, in the $A_{n+1}$ and $A_n$ coefficients, which allows introducing the new set of variables $B_n$ (17). The range of the $B_n$ indices is continuous and can be split again into even and odd parts. The above procedure can be repeated recursively.
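The first step of this decomposition, the separation into even and odd indices, can be sketched in Python. The code assumes the unnormalized DCT-II and a length that is a power of 2; the names `sums` and `diffs` are hypothetical stand-ins for the role played by the variables of (12). Only the even branch is reduced recursively here; the odd branch is left as a direct sum and does not reproduce the further scaling trick described in the text.

```python
import math


def dct2_direct(x):
    """Unnormalized DCT-II evaluated directly from the definition."""
    n_pts = len(x)
    return [sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * n_pts))
                for n in range(n_pts))
            for k in range(n_pts)]


def dct2_split(x):
    """Unnormalized DCT-II by recursive even/odd splitting (N must be a power of 2)."""
    n_pts = len(x)
    if n_pts == 1:
        return [x[0]]
    half = n_pts // 2
    sums = [x[n] + x[n_pts - 1 - n] for n in range(half)]    # feed the even indices
    diffs = [x[n] - x[n_pts - 1 - n] for n in range(half)]   # feed the odd indices
    out = [0.0] * n_pts
    out[0::2] = dct2_split(sums)   # X_{2p} is an N/2-point DCT of the sums
    for k in range(1, n_pts, 2):   # odd k: half-length sum over the differences
        out[k] = sum(diffs[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * n_pts))
                     for n in range(half))
    return out
```

The even branch halves the problem at every step, which is why the even-index part of the transform keeps the structure of (1) with a shorter range of indices.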

8-point DCT algorithm
For N = 8, the sets of variables follow from formulae (12) and (17). For even indices the DCT coefficients are expressed by (20); for odd indices, with the support of (15), by (22). A direct approach from the classical definition requires a single multiplication for even indices (20) and 5 multiplications for odd indices (22). The scaled coefficients $S_{1,7,3,5} X_{1,7,3,5}$ in (22) can be expressed in an equivalent way introduced by Arai, Agui and Nakajima (AAN, 1988), which reduces the number of multiplications from 5 to 4.
Multiplications in powerful FPGA chips can, however, be performed in very fast dedicated DSP blocks in a single clock cycle. Signals processed in parallel threads in a hardware implementation of a pipeline design have to be synchronized with each other. The pipeline approach requires additional shift registers for synchronization, also for signals not currently being processed, and such synchronization needs additional resources. Fig. 5 shows the part of the pipeline chain corresponding to the odd indices of the DCT coefficients (lower part of Fig. 4).

Fig. 5. The AAN algorithm limited to indices 4-7 only, with a time-oriented structure. Adders, subtractors, multipliers and shift registers are marked in blue, gray, black and green, respectively. Red corresponds to routines requiring cascaded processes.
A direct implementation of the pure AAN algorithm requires 7 pipeline stages, which use additional shift-register resources for synchronization in operations such as X(t+1) = X(t). In a numerical calculation on a processor the data simply wait for the next execution cycle. The $D_{64}$ block contains a cascade of a sum and a multiplication. Implementing the cascade within a single clock cycle in an FPGA logic block significantly reduces the speed. Additionally, the lpm_add_sub mega-function from the Altera® library of parameterized modules (LPM) does not support the inversion of a sum, i.e. $B_4 = -(A_4 + A_5)$ or $E_4 = -(D_{64} + D_4)$. These operations would have to be performed in cascade by an adder and a sign inversion. Cascaded operations performed in the same clock cycle significantly lower the global registered performance. Routines E and F from Fig. 5 have been merged into a single routine E (Fig. 6) to shorten the pipeline and remove unnecessary shift registers.
A classical approach reduces the length of the chain from 6 to 5 stages, at the cost of one additional multiplier. Shortening the pipeline chain, and consequently reducing the number of shift registers needed for synchronization, saves a significant number of logic blocks, especially for a wide data bus. In order to reduce approximation errors, the data bus is widened in the intermediate stages.

16-point DCT algorithm
The 16-point DCT algorithm is implemented according to the classical approach, with the number of pipeline stages optimized at the cost of using embedded multipliers (Szadkowski, 2009). The 1st and 2nd pipeline stages use the sets of variables (12) and (17), respectively. For N = 16 the fractional angle of the twiddle factor in the 1st step of minimization equals β = π/… . The same fractional angle corresponds to the 2nd step of minimization for the even indices corresponding to $A_n$.

Implementation of the code into a FPGA
The spectral trigger should be generated if the DCT coefficients normalized to the 1st harmonic lie in an arbitrarily narrow range:

$$Thr^L_k \le \xi_k = \frac{X_k}{X_1} \le Thr^H_k \qquad (43)$$

where $Thr^L_k$ and $Thr^H_k$ are the lower and upper thresholds for each spectral index k, respectively. The Altera® Library of Parameterized Modules (LPM) contains the lpm_divide routine, which supports the division of fixed-point variables. However, this routine needs a huge number of logic elements and it is slow (the calculation requires 14 clock cycles in order to keep a sufficiently high registered performance). DSP blocks also do not support this routine. A simple conversion of (43) into a comparison of products (44) avoids the division. According to (44) the calculation of a sub-trigger needs two multipliers, two comparators and an AND gate. The multiplier stage of an embedded multiplier block supports 9 × 9 or 18 × 18 bit multipliers. Depending on the data width or operational mode of the multiplier, a single embedded multiplier can perform one or two multiplications in parallel. Because of the wide data buses, the embedded multiplier blocks do not use the 9 × 9 mode in any multiplication. Each multiplier uses two embedded 9-bit multiplier elements. The full DCT procedure, with the calculation of all coefficients, needs 70 DSP blocks. However, the scaling of $X_k$ in the last pipeline stage is no longer needed; it is moved to the thresholds according to (44). Removing the last pipeline stage reduces the number of DSP blocks to 40. Sub-trigger routines (Fig. 9) need 2 DSP blocks each. The chip EP3C40F324I7 selected for the 4th generation of the 1st level SD trigger contains 252 9-bit DSP multipliers. So, for 3-fold coincidences and an implementation of 3 "engines", a single DCT "engine" can support only 11 independent DCT coefficients (Szadkowski, 2011). … and $D^0_k$ are generated for the patterns $A_k$, $B_k$, $C_k$ and $D_k$ (k = 2,4,6) from Fig. 3, respectively. Sub-triggers are synchronized with each other in shift registers so that they arrive simultaneously at an AND gate (Fig. 11).
In order to keep the trigger rate below the limit imposed by the limited radio bandwidth, the amplitude of the jump is additionally verified. If the jump is too weak, a veto comparator disables the AND gate. Thus, the final trigger is generated if the spectral coefficients $\xi_k$ match the pattern ranges selected by the multiplexer for each time bin, in 4 consecutive time bins in total, and if the veto circuit enables the AND gate. The delay of the veto signal depends on the type of shape under investigation. For a rising edge of a single time bin the veto is delayed by 3 clock cycles; for the investigated pattern corresponding to a rising edge of three time bins the maximal ADC value appears 2 clock cycles later than in the previous case, so the veto should be delayed by a single clock cycle only.
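The replacement of the division by two multiplications and two comparisons can be sketched as follows. This is an illustration under assumptions, not the FPGA code: the 10 fractional bits, the helper `to_fixed`, the function name and the example numbers are hypothetical, and the comparison is only equivalent to the ratio test when the 1st harmonic $X_1$ is positive.

```python
FRAC_BITS = 10  # assumed fixed-point precision of the scaled thresholds


def to_fixed(ratio):
    """Scale a real-valued threshold to an integer with FRAC_BITS fractional bits."""
    return round(ratio * (1 << FRAC_BITS))


def sub_trigger(x_k, x_1, theta_low, theta_high):
    """Fire when theta_low <= x_k / x_1 <= theta_high, without dividing.

    Assumes x_1 > 0; theta_low and theta_high are integers from to_fixed().
    """
    lower = x_1 * theta_low         # first multiplier
    upper = x_1 * theta_high        # second multiplier
    scaled = x_k << FRAC_BITS       # align x_k with the scaled products
    return lower <= scaled <= upper  # two comparators combined by an AND


theta_l = to_fixed(0.30)
theta_h = to_fixed(0.40)
```

With these thresholds, `sub_trigger(350, 1000, theta_l, theta_h)` fires because 350/1000 lies inside the acceptance lane, while 250 or 450 in place of 350 does not.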
The 16-point DCT with a 16-stage shift register at 100 MHz sampling can cover a 150 ns time window. For horizontal or very inclined showers this interval is sufficient for the analysis. However, for a higher sampling frequency, when the time window may turn out to be too short, the shift register may be extended from 16 to 24 stages and the eight samples for the higher indices may be taken from the last 16 shift register nodes according to Fig. 11. The samples with higher indices correspond to the exponentially attenuated tail, and the analysis of the tail is less critical than that of the rising edge, where samples are analyzed at full speed. 3 DCT trigger "engines" have been successfully merged with the Auger code working with 100 MHz sampling. The final code utilizes only 38… gives an opportunity to add new, sophisticated algorithms. The slack reported by the compiler corresponds to a maximal sampling frequency of 112 MHz, which gives a sufficient safety margin for stable operation of the system. For sufficiently high amplitudes of the ADC samples the Threshold trigger is generated 32 clock cycles earlier than the spectral trigger (24 clock cycles of propagation in the shift registers + 8 clock cycles of processing in the DCT chain). If the Threshold trigger has already been generated, subsequent triggers are inhibited for the 768 time bins necessary to fill the memory buffers (see Fig. 7 in (Szadkowski, 2005a)). Because the Threshold trigger (sensitive to larger signals) has a higher priority than the spectral trigger, the ADC samples are not delayed for the Threshold trigger in order to synchronize it with the spectral one. The system uses 10-bit resolution (the Auger standard). A compilation for 12-bit resolution for the current chip EP3C40F324I7 failed because of a lack of DSP blocks; a 12-bit system requires the bigger chip EP3C55. The slack times are at the same level as for the EP3C40. All pipeline routines shown in Fig. 8 are implemented in a direct mode (not in a pipeline mode as, e.g., in the 2nd generation of the FEB based on the ACEX family (see Fig. 2 in (Szadkowski, 2005a)) or for the FFT implementation in the Cyclone family (Fig. 2 in (Szadkowski, 2005b))).
According to the above estimates, the configuration with 3 "engines" does not support all $\xi_k$ sub-triggers because of the limited number of DSP blocks. However, for the next generation of water Cherenkov detector arrays, where probably only a single PMT will be used, 3 "engines" will be implemented to investigate and detect 3 different shapes of FADC traces corresponding, e.g., to different rise times of the rising edge.

Preliminary tests
Analysis of Auger ADC traces of very inclined showers shows that the maximum of the signal is mostly reached within a single time bin. The attenuation factor of the tail is in the range β = 0.2-0.5. Every signal with its first two time bins on the pedestal level will certainly have only one time bin on the pedestal level in the next clock cycle, but not vice versa. A signal with only a single time bin on the pedestal level before a sharp rising edge can have a significant contribution in the 2nd time bin before the rising edge, and it will not be recognized by a pattern recognition procedure tuned to Shape_A. A procedure recognizing Shape_A is more restrictive and gives a lower trigger rate than one for Shape_B. Because of the limited number of DSP blocks only 11 DCT coefficients can be analyzed simultaneously. For Shape_A, $X_4$ and $X_{10}$ are ignored, and for Shape_B, $X_6$ and $X_{14}$, as they are weakly sensitive to changes of the signal shape. A trigger based only on DCT pattern recognition gives too high a rate because of the contribution of very weak signals that also have the appropriate shape but are usually treated as noise. In order to reduce and control the trigger rate, a veto threshold has been introduced. The calculation of the DCT coefficients in the pipeline chain, followed by the calculation of the sub-triggers in the multiplier and comparator block, takes 12 clock cycles. The signal is delayed by the same time, so that it is compared with the veto threshold simultaneously with the generated DCT sub-triggers. If the signal is above the sum of the veto threshold and the pedestal, the sub-triggers are enabled to generate the final spectral trigger. The condition that all 11 DCT coefficients lie inside the acceptance lane is too strong: the shapes are not ideal, and noise introduces additional shape distortions. As in the ToT trigger, only a part of the "fired" sub-triggers (Occupancy ≤ 11 = max. number of sub-triggers) is enough to generate the final spectral trigger.
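The combination of the Occupancy condition and the amplitude veto can be sketched as below. The function and parameter names are hypothetical, and the use of "greater than or equal to" for the Occupancy comparison is an assumption (with a maximum of 11 sub-triggers, a strict comparison could never be satisfied for Occupancy = 11). The example numbers, apart from the pedestal of 40 and the amplitude of 140 ADC-counts that appear in the simulation of Fig. 10, are illustrative.

```python
def spectral_trigger(sub_triggers, occupancy, amplitude, pedestal, veto_threshold):
    """Final spectral trigger from preliminary sub-triggers and the amplitude veto.

    sub_triggers: iterable of booleans, one per analyzed DCT coefficient.
    """
    fired = sum(bool(s) for s in sub_triggers)            # number of fired sub-triggers
    enabled = amplitude > pedestal + veto_threshold       # veto passes strong signals only
    return enabled and fired >= occupancy


# 11 analyzed coefficients, 9 of them inside their acceptance lanes
flags = [True] * 9 + [False] * 2
```

For these flags, an Occupancy of 8 with an amplitude of 140 ADC-counts on a pedestal of 40 and a hypothetical veto threshold of 50 produces a trigger, whereas an amplitude of 80 ADC-counts or an Occupancy of 10 does not.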

Fig. 1. Positions of triggered surface detectors on the Auger array for the very inclined shower (θ = 83.5°) no. 1155555. Muons triggered only a few surface detectors, although they crossed several hundred detectors. The distance between opposite detectors is 54 km.

Fig. 3. Propagation of the pulse (1st row) through the shift register, DCT-II coefficients (2nd row), absolute values of the DFT (3rd row) and the corresponding real (Re) and imaginary (Im) parts (4th row). The 1st column shows the pulse (shape A) when two time bins are on the pedestal level, the 2nd one (shape B) when only one time bin is still on the pedestal level, while the 3rd one (shape C) shows the pulse fully filling the range of the investigated shift registers. For a signal shape related to the exponential attenuation (shape C), the contribution of the higher DCT coefficients is small and suitable for a trigger. When a peak appears in the declining signal (last column, shape D), the DCT coefficients immediately exceed the assumed, relatively narrow acceptance range for triggers. The DFT coefficients (Re and Im in the 4th row) have a structure similar to that of the DCT; however, for the purely exponentially declining signal the higher real DFT harmonics have relatively high values and are not suitable for triggering. The absolute values of the DFT components (3rd row) are clearly insensitive to the discussed conditions.

Fig. 4. A fast DCT algorithm developed in 1988 by Arai, Agui and Nakajima.

Minimizing the number of multiplications is one of the fundamental goals in long numerical calculations. Reducing the number of product terms significantly speeds up sophisticated calculations, because a single multiplication requires several processor clock cycles.

Fig. 6. Optimized AAN algorithm for indices 4-7. A redefinition and splitting of variables allowed a reduction of the chain length. A simple redefinition of nodes removes the difficulties mentioned above. The $B_4$ node, defined as the sum of the $A_{4,5}$ nodes, requires a simple lpm_add_sub mega-function. The $D_4$ node, now with an inverted sign, allows lpm_add_sub to be used in $E_4$ for a subtraction. The $D_{64}$ node from Fig. 5 can be split into the subtraction $C_{64}$ and the multiplication $D_{64}$ in the next clock cycle (Fig. 6).

Fig. 7. Optimized, shorter pipeline chain based on the classical approach. The length of the chain is reduced at the cost of an additional multiplier.

Fig. 9. The structure of sub-triggers. The DCT coefficients $X_k$ are not calculated directly. They have been replaced by the boundaries of the acceptance lane: the upper and lower thresholds $H_{15} \times \theta^H_k$ and $H_{15} \times \theta^L_k$, respectively. Signals between these thresholds (two comparators + AND gate) generate preliminary sub-triggers, which are then summed and compared with the arbitrary Occupancy level. If the number of "fired" preliminary sub-triggers is above the selected Occupancy, the final sub-trigger is generated for further processing. It is enabled or disabled depending on the veto variable, which verifies the minimal amplitude of the input signals in order to keep the trigger rate at a reasonable level and to prevent saturation of the transmission channel.

Fig. 10. Simulation of the 1-fold spectral trigger simultaneously with the 3-fold threshold trigger. The length of the shift registers is 16. Data in the Ext_ADC0 channel correspond to a muon signal with a 1-time-bin rising edge, an 11-time-bin attenuation tail and a constant pedestal of 40 ADC-counts. Together with the beginning of the muon peak (at 23.075 µs), the two neighboring channels Ext_ADC1,2 are driven artificially to 150 ADC-counts to generate the standard threshold trigger based on the 3-fold coincidence. The internal PLL clock is 80 MHz. The internal standard threshold trigger appears 5 clock cycles later (+62.5 ns). The nodes lpm_ff:$00000|dffs - lpm_ff:$00030|dffs correspond to the shift register x_15, ..., x_0. The system is tuned for Shape_A recognition (the first two time bins on the pedestal level). Ena_A_reg is generated (+200 ns = 16 clock cycles) because the amplitude of the signal (140 ADC-counts) is above the veto threshold. It is delayed by a further 15 cycles to be synchronized with SUB_TRIG_Occ. Sub-triggers are generated 27 clock cycles (+337.5 ns) after the rising edge. The calculation of the Occupancy takes a further two clock cycles. 29 clock cycles after the rising edge, owing to the coincidence of the Occupancy and Ena_DCT_del (inversion of the veto), the SUB_TRIG is generated. Finally it appears in the same position as the 3-fold coincidence threshold trigger 31 clock cycles later. The Final_DCT trigger corresponds to a possible coincidence with neighboring DCT "engines". If the standard threshold trigger (based on the 3-fold coincidence) appears, any subsequent triggers are ignored for 768 clock cycles.
Fig. 11. A scheme of the final spectral trigger. The shift register presented here has an extended length of 24 stages to cover a longer time window. However, for sampling frequencies $f_s$ ≤ 100 MHz, 16 stages (T ≥ 150 ns) give a window wide enough for the analysis of horizontal showers. The signal shifted along the register chain is checked against the expected patterns in 4 consecutive time bins, i.e. those corresponding to the ADC shapes in Fig. 3 (1st row, first 3 graphs). The 4th pattern is exactly the same as the 3rd one: the amplitude of the signal decreases, but the DCT coefficients remain the same (still an exponential attenuation).

Fig. 12. Shapes of signals with various attenuation factors and the first two time bins on the pedestal level.

Fig. 13. Coefficients for signals with various attenuation factors and the first two time bins (left) and only one time bin (right) on the pedestal level.

After a scaling according to (15) we can introduce the new set of variables for the 3rd pipeline stage:

… allows implementation of fast multipliers from the DSP blocks and calculation of products in a single clock cycle. $\theta^L_k$ and $\theta^H_k$ are the lower and upper scaled thresholds, respectively, which are set as external parameters.
So, the processing of a signal requires a single clock cycle only. All routines are fast enough to work with 100 MHz sampling without additional pipeline stages and they do not introduce additional latency.