Open access peer-reviewed chapter - ONLINE FIRST

Short-Latency Evoked Potentials of the Human Auditory System

Written By

Gijsbert van Zanten, Huib Versnel, Nathan van der Stoep, Wiepke Koopmans and Alex Hoetink

Submitted: November 21st, 2021. Reviewed: December 15th, 2021. Published: March 17th, 2022.

DOI: 10.5772/intechopen.102039

Human Auditory System - Function and Disorders, edited by Sadaf Naz

From the Edited Volume

Human Auditory System - Function and Disorders [Working Title]

Dr. Sadaf Naz



Auditory Brainstem Responses (ABR) are short-latency electric potentials from the auditory nervous system that can be evoked by presenting transient acoustic stimuli to the ear. Sources of the ABR are the auditory nerve and brainstem auditory nuclei. Clinical application of ABRs includes identification of the site of lesion in retrocochlear hearing loss, establishing functional integrity of the auditory nerve, and objective audiometry. Recording of ABR requires a measurement setup with a high-quality amplifier with adequate filtering and low skin-electrode impedance to reduce non-physiological interference. Furthermore, signal averaging and artifact rejection are essential tools for obtaining a good signal-to-noise ratio. Comparing latencies for different peaks at different stimulus intensities allows the determination of hearing threshold, location of the site of lesion, and establishment of neural integrity. Audiological assessment of infants who are referred after failing hearing screening relies on accurate estimation of hearing thresholds. Frequency-specific ABR using tone-burst stimuli is a clinically feasible method for this. Appropriate correction factors should be applied to estimate the hearing threshold from the ABR threshold. Whenever possible, obtained thresholds should be confirmed with behavioral testing. The Binaural Interaction Component of the ABR provides important information regarding binaural processing in the brainstem.


  • auditory evoked potential
  • auditory brainstem response
  • ABR
  • click evoked ABR
  • frequency-specific ABR
  • objective audiometry

1. Introduction

Auditory Evoked Potentials (AEP) are electric potentials from the auditory nervous system that can be evoked by presenting abrupt acoustic stimuli to the ear. Registration of the electric potential as a function of time after stimulus presentation shows a reproducible pattern of waves that occur at specific time points after stimulus onset. The time between stimulus onset and occurrence of an extreme value of a wave is called latency. As can be appreciated from Figure 1, responses span a time window of several orders of magnitude ranging from several milliseconds to a second. This wide range can be divided into three time-windows reflecting different latency ranges. Registrations within these different time-windows are generally called Auditory Brainstem Response (ABR) for short time-windows up to 8 ms, Middle Latency Auditory Evoked Potentials (MLAEP) from 8 ms up to approximately 40 ms, and Long Latency Auditory Evoked Potentials (LLAEP) for time-windows of 40 ms and longer. In this chapter we will focus on short latency ABR responses.

Figure 1.

Impression of registration of an auditory evoked potential. The abscissa shows latency in ms after stimulus onset on a logarithmic scale. The ordinate shows the amplitude of the electric potential in μV.

Figure 2 shows the results of a PubMed search with the terms “auditory” and “potential” and “brain stem” and “human” (the latter both in text and as a MeSH term). It can be appreciated that the first paper mentioning “auditory potential” was published in 1948, but it was not until the early 1970s that the subject generated a substantial number of publications per year. In the early 1970s, Jewett and Williston [1] introduced labeling of vertex-derived positive extremes of the ABR waves with Roman numerals. They also established that these waves are far-field potentials from subcortical structures, providing indirect evidence that wave I is volume-conducted from the eighth cranial nerve. Furthermore, they concluded that “waves I through VI have sufficient reliability to be worthy of establishing clinical and experimental norms”. This makes them, and particularly wave V, suitable for objective audiometry based on wave occurrence and latency. Picton et al. [2] extended ABR nomenclature by introducing the prime for the vertex-negative extreme following a positive extreme. Thus V′ identifies the vertex-negative extreme following vertex-positive extreme V. In this chapter, we will refer to the vertex-positive extremes as peaks. The first intracranial recordings in humans were, to our knowledge, reported in [3, 4]. In the first study, potentials were recorded from the intracranial part of the auditory nerve in patients undergoing operations for cranial nerve disorders. The results indicated that the auditory nerve gives rise to the first two peaks in the scalp-recorded ABR and not only to the first peak. The latter study concluded, on the basis of in-depth recordings during brain surgery, that waves II and III are primarily generated within the pons, with possible contributions from the auditory nerve. Waves IV and VI originate from the pons and the medial geniculate body, respectively. In Section 2 we will discuss the sources of the ABR more extensively.

Figure 2.

Number of publications with search terms “auditory” and “potential*” and “brainstem” (solid line) and “auditory” and “potential*” and “brainstem” and “audiometry” (dashed line). The term “Human” was used as a search term both in full text and as a MeSH term.

Clinical application of ABRs includes identification of the site of lesion in retrocochlear hearing loss, establishing functional integrity of the auditory nerve, and objective audiometry. With the advent of Magnetic Resonance Imaging (MRI) for the detection of acoustic neuroma, the clinical use of ABR for this purpose has declined. ABR remains an important tool, however, for establishing neural functional integrity in cases of suspected auditory neuropathy and for objective audiometry in newborns. Section 3 will give an overview of all aspects of clinical ABR measurements.

Many countries have established Universal Newborn Hearing Screening Programs for the identification of children with permanent congenital hearing loss. Outcomes of these programs include a lower age of identification, lower age of provision of amplification, and better speech production and perception [5]. Infants who do not pass newborn hearing screening are referred for diagnostic audiological assessment to determine the degree and type of hearing loss, and hearing loss configuration. Hearing thresholds in newborns are typically estimated by using ABR for objective audiometry because behavioral techniques such as Visual Reinforcement Audiometry (VRA) or Conditioned Play Audiometry (CPA) are not feasible at a very young age. Another application of ABR is the detection of ototoxicity in young children who are treated with cisplatin for cancer or (concomitantly) with aminoglycoside or glycopeptide antibiotics for infections. Section 4 will discuss the application of frequency-specific stimuli for objective audiometry in these patient groups. Finally, in Section 5 we will discuss an example of the application of binaural ABR measurements as an objective measure of directional hearing ability.


2. Neural sources underlying the ABR

The structures that contribute with their stimulus-evoked electrical activity to the ABR are the auditory nerve, cochlear nucleus, superior olivary complex, and the lateral lemniscus. These structures will be briefly described with respect to their physiological responses and function.

Comprehensive overviews are provided for instance in [6]. Since the ABR is often used, both in the clinic and in animal experiments, to assess hearing loss caused by damage in the cochlea, that structure is included.

2.1 Description of pathway

Sound reaches the cochlea via the outer ear canal, tympanic membrane, and middle-ear ossicles. The sensory organ in the cochlea, known as the organ of Corti, is located on the basilar membrane, which stretches from the base near the footplate of the stapes to the apex. Due to gradients of its mechanical properties from base to apex the basilar membrane functions as a frequency filter bank and it is tonotopically organized: it maximally vibrates to high frequencies of the sound at the base and to low frequencies towards the apex, and each place along the basilar membrane corresponds to a frequency it is most sensitive to, a characteristic frequency (CF). Vibrations start at the base and travel towards the apex, a phenomenon known as the traveling wave. Consequently, cochlear responses occur faster after stimulus onset to high frequencies than to low frequencies.

In the organ of Corti, two types of sensory hair cells are distinguished: inner and outer hair cells (IHCs and OHCs, respectively), which are arranged in four rows in the ratio 1:3 (one row of IHCs and three rows of OHCs) and which differ distinctly in function. The IHCs act as mechano-electrical transducers that pass the acoustical information on to the nerve, and the OHCs act as amplifiers, increasing detection sensitivity by 40–50 dB and increasing frequency selectivity. In both types of hair cells, acoustical vibrations are converted to electrical potentials. In IHCs these receptor potentials trigger action potentials in the nerve. For that purpose, each IHC is innervated by 10–20 afferent auditory nerve fibers, which are myelinated and which systematically vary in spontaneous rate (SR) and in threshold at their CF [7], the latter allowing for a wide dynamic range to be encoded. In the OHCs, the receptor potentials trigger the cells to contract and expand, and this motility is thought to amplify the basilar membrane vibrations, in particular at low sound levels. Irrespective of the mechanisms, OHC loss leads to a threshold shift of 40–50 dB and deterioration of frequency tuning. Each OHC is innervated by a single unmyelinated afferent fiber, and it shares this fiber with several other OHCs. These fibers have very high thresholds (>90 dB SPL). The great majority of the afferent auditory nerve fibers (~95%) receive input from the IHCs. An auditory nerve of a young normal-hearing subject contains about 35,000 fibers.

Action potentials that are generated at the IHC synapse are propagated along the auditory nerve to the cochlear nucleus (CN). The nerve branches to three divisions of the nucleus: anterior ventral cochlear nucleus (AVCN), posterior ventral cochlear nucleus (PVCN), and dorsal cochlear nucleus (DCN). The AVCN contains for the most part bushy cells, which show responsiveness similar to that of the auditory nerve fibers. Their onset response latencies are ~0.6 ms longer than those of the nerve fibers [8]. Notably, the timing of their action potentials is more precise than that of the auditory nerve, i.e., when stimuli are presented repetitively, the action potentials have a very similar latency. The PVCN contains, among other cell types, multipolar cells which show so-called chopper responses with longer latencies than the bushy cells. The frequency tuning in AVCN and PVCN is similar to that in the auditory nerve. The DCN has a complex circuitry of various cell types including inhibitory interneurons. Consequently, many DCN neurons show frequency tuning that is characterized by excitatory responses to limited frequency-sound level combinations, and inhibitory responses to a wide range of frequencies and levels.

The bushy cells in the AVCN project to the superior olivary complex (SOC), which is the first station along the auditory pathway to combine input from both ears [9]. Specifically, the spherical bushy cells send their precise phase-locked action potentials to both ipsi- and contralateral medial superior olive (MSO) and to the ipsilateral lateral superior olive (LSO); globular bushy cells project to the contralateral medial nucleus of the trapezoid body (MNTB) from where inhibitory input is delivered to the LSO. Receiving well-timed input from both ears, neurons in the MSO are tuned to interaural time differences (ITD), and receiving ipsilateral excitatory and contralateral inhibitory input LSO neurons are sensitive to interaural level differences (ILD).

The next station in the auditory brainstem is the lateral lemniscus (LL), which can globally be divided into a ventral nucleus (VNLL) processing monaural information and a dorsal nucleus (DNLL) processing binaural information. The VNLL receives input from the contralateral CN, and the DNLL receives input from ipsilateral MSO and bilateral LSO.

Monaural and binaural pathways from each of the above-described brainstem nuclei converge in the inferior colliculus (IC). This convergence allows the IC to process several auditory features including basic spectrotemporal features [10] and 2-dimensional spatial information [11].

2.2 Contribution of various nuclei to ABR

The ABR waveform is commonly described as consisting of five peaks. Peaks III and V typically dominate peaks II and IV, respectively, and are the ones best observed in daily practice in a clinic or laboratory. Peak I appears more prominently in animals than in humans, where it fades faster with decreasing stimulus level than peaks III and V (see also Section 3.6). Electrical activity from the auditory nerve and brainstem nuclei contributes to the ABR. A first-order approach to understanding which neural population corresponds to which peak is to consider the sequence of nuclei in the pathway. The inter-peak interval of approximately 1.0 ms agrees with the axonal conduction time and synaptic delay between the generation of action potentials at two successive neurons. Indeed, as summarized by [12] for the human ABR, partly based on intraoperative recordings, peak I reflects the activity of the auditory nerve, peak III that of the CN, peak IV the SOC, and peak V the LL. Peak II is generated by the central part of the auditory nerve, likely where it branches to the three CN divisions. In smaller mammals used in auditory research, like gerbils, mice, and guinea pigs, four rather than five peaks are distinguished, with peak IV being analogous to peak V of the human ABR [12]. A fifth peak would then reflect responses in the IC, as a correspondence to IC evoked potentials in mice indicated [13]. Based on a series of careful lesion and modeling studies of click-evoked ABRs in cats, [14] linked peak I to the auditory nerve, II to the globular bushy cells in AVCN, III to spherical bushy cells and cells driven by globular cells, IV to MSO principal cells, and V to cells driven by MSO principal cells.

In a secondary approach, one should consider that the early stations besides contributing to early peaks can also contribute to later peaks. We consider the ABR evoked by the most commonly used stimulus, a broadband click. As a consequence of the traveling wave mechanics, the click response latency in the auditory nerve is shortest for high-CF neurons and increases with decreasing CF [15], which leads one to conclude that high-CF fibers contribute to wave I [14]. The low-CF fibers with longer latencies and multi-peaked responses (with inter-spike intervals of 1/CF) therefore may contribute to later waves. In particular for high click levels, the high-CF fibers show second firings about 1 ms after the first action potential, an interval that is related to the neural refractoriness [16], and notably, similar to the ABR inter-peak interval. The same notion applies to the CN bushy cells, i.e. those with lower CFs have longer click latencies and may contribute to later peaks.

2.3 Summing contributions from the various sources

The following factors determine the extent to which a neural population contributes to the ABR: the number of responding neurons, the discharge probabilities of the individual neurons, the discharge latencies, the synchronization of discharges between neurons, the synchronization of the individual neuron, and the unit response (UR). How action potentials of a neural population shape an ABR wave is illustrated in Figure 3 by the compound action potential (CAP), which reflects the auditory nerve response, thus analogous to wave I of the ABR. The CAP is mathematically described as the convolution of the compound discharge latency distribution (CDLD) and the UR, a concept introduced by [18]. An example of a CAP with corresponding CDLD and UR is shown in Figure 3, along with the convolution equation.

Figure 3.

Example of CAP and corresponding CDLD, which is constructed based on the CAP and depicted UR using the convolution equation. The UR is modeled after experimental guinea pig data [17], and the PSTH is an example of a recorded single-fiber response to 256 presentations of a monophasic condensation click of 100 μs.

The CDLD is the sum of the discharge probabilities of all responding auditory nerve fibers, which are typically recorded as poststimulus time histograms (PSTHs) acquired by presenting the stimulus a few hundred times. The discharge probability is the ratio of the number of discharges to the number of stimuli. The synchronization is high when the latency is very similar for each stimulus presentation, thus resulting in a peaky PSTH, and the synchronization is low when the discharges are spread. The click-evoked PSTH in Figure 3 has a latency of about 2.0 ms with some discharges at 1.8 ms and some at 2.3 ms; the second peak reflects second discharges of the neuron. The CDLD will be relatively narrow when the PSTHs of the responding neurons have the same latency, and broad when the latencies vary among neurons. The latter applies to the auditory nerve since fibers with a low SR typically have longer latencies than fibers with a high SR [15]. The UR is the potential at the recording electrode that results from a single action potential. It determines both the size and shape of the AEP waveform, and it depends mostly on the distance of the electrode from the neural population. Generally, the UR depends on specific electrode configurations, the tissue between electrode and neurons, which includes electrodes at the skin, and skull characteristics. It is the factor that is most difficult to assess; for the CAP, it has been assessed by recording the potential at the CAP-recording site around the occurrence of an action potential [17, 19, 20]. Each neuron may have its own UR depending on the neuron’s location and morphometry. For the auditory nerve it can be assumed that the UR does not vary significantly with CF and SR [17], an assumption that generally works well when using the UR to predict CAPs [21, 22, 23]. The neural populations in the brainstem, however, will have URs that vary greatly between nuclei [14].
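
The convolution of CDLD and UR described above can be illustrated with a minimal Python sketch. Everything numerical in it is an assumption made for illustration only: the Gaussian PSTH shape, the damped-sinusoid UR, the fiber latencies, and the function names are hypothetical and are not the guinea pig data underlying Figure 3. The sketch only shows the mechanism, namely that the same number of fibers produces a smaller compound response when their latencies are spread, and that the response scales with the number of fibers when all else is equal.

```python
import math

DT_MS = 0.05  # time step in ms (assumed resolution)

def gaussian_psth(latency_ms, jitter_ms, n_bins):
    """Hypothetical single-fiber PSTH: one discharge per stimulus, Gaussian jitter."""
    raw = [math.exp(-0.5 * ((i * DT_MS - latency_ms) / jitter_ms) ** 2)
           for i in range(n_bins)]
    total = sum(raw)
    return [r / total for r in raw]  # discharge probability per bin, sums to 1

def unit_response(n_bins, freq_khz=1.0, decay_ms=0.5):
    """Hypothetical UR: damped sinusoid starting with a negative deflection."""
    return [-math.sin(2 * math.pi * freq_khz * i * DT_MS) * math.exp(-i * DT_MS / decay_ms)
            for i in range(n_bins)]

def convolve(a, b):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def compound_response(latencies_ms, jitter_ms, n_bins=120):
    """CDLD (sum of all fiber PSTHs) convolved with the unit response."""
    cdld = [0.0] * n_bins
    for latency in latencies_ms:
        for i, p in enumerate(gaussian_psth(latency, jitter_ms, n_bins)):
            cdld[i] += p
    return convolve(cdld, unit_response(n_bins))

def peak_to_peak(waveform):
    return max(waveform) - min(waveform)

# 50 fibers with identical latency versus 50 fibers with latencies spread over ~1 ms
narrow = compound_response([2.0] * 50, jitter_ms=0.1)
broad = compound_response([1.5 + 0.02 * k for k in range(50)], jitter_ms=0.1)
print(peak_to_peak(narrow), peak_to_peak(broad))
```

In this toy setting the narrow CDLD yields the larger amplitude, in line with the role of synchronization between neurons listed among the factors in Section 2.3.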

As an approximation, the CAP amplitude is proportional to the number of responding neurons (N in equation in Figure 3). Figure 4 shows amplitudes of CAPs evoked by an electrical current pulse as a function of the number of auditory nerve fibers in guinea pigs.

Figure 4.

Amplitudes of CAPs to electrical pulse stimulation (eCAPs) as a function of packing density of spiral ganglion cells. Data are acquired in 97 guinea pigs that are normal-hearing or ototoxically deafened with varying duration of deafness (2–14 weeks). Electrical pulses used were biphasic pulses with a phase duration of 50 μs and inter-phase gap of 30 μs and alternating polarity. Current levels are maximal, i.e., at or near saturation. The packing density reflects the number of surviving neurons. For methodological details see Ramekers et al. [24].

Most of these guinea pigs have been deafened and consequently, the number of neurons, quantified by packing density of the cell bodies in Rosenthal’s canal at different durations of deafness, varied widely [25]. Using an electrical stimulus, synchronization is expected to be large, and the great majority of surviving neurons are expected to respond, creating an ideal condition to test the convolution approximation. Indeed, the CAP amplitude increases significantly with the neural packing density. However, the amplitude varies enormously among guinea pigs, and packing density explains only 36% of the variance. This outcome confirms that the number of responding neurons is an important factor, but at the same time it underscores the unreliability of amplitude as a measure of auditory evoked potentials including the ABR.

How do responses with different latencies add up? To address that question again the CAP provides a good illustration as shown in Figure 5.

Figure 5.

Illustration of summing of two waveforms with varying latencies. In the left column, the latency difference between the first and second contribution is 0.6 ms, and in the right column, the latency difference is 0.4 ms. The size of the contributions is unchanged. The resulting waveforms (bottom row) differ greatly in that the left one shows a clear second peak, whereas the right one shows only one peak. The waveforms here show CAPs, but the principle applies to ABRs as well.

The example shows two CAP contributions, with a ratio second/first of 0.25, and a latency difference of CDLD of 0.6 ms (left column) and 0.4 ms (right column). The difference of 0.2 ms has enormous consequences for the resulting waveforms. The left waveform shows two distinct waves (N1, P1, N2, P2) but the right waveform shows a merged P1-P2 while the N2 has vanished. It illustrates an often occurring phenomenon of ABRs that waves appear as merged components, therefore not showing the classical 5 waves.
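
The merging of waves can be reproduced with a simplified Python sketch. It uses the amplitude ratio (0.25) and the latency differences (0.6 and 0.4 ms) given above, but the wave shape is an assumption: each contribution is reduced to a single negative-going Gaussian of hypothetical width 0.15 ms rather than the biphasic CAPs of Figure 5, and the function names are ours. The outcome depends on this assumed shape; with a different width, the latency difference at which the waves merge will shift.

```python
import math

DT_MS = 0.01  # time step in ms

def wave(t_ms, latency_ms, width_ms=0.15):
    """Hypothetical negative-going wave (arbitrary units), Gaussian-shaped."""
    return -math.exp(-0.5 * ((t_ms - latency_ms) / width_ms) ** 2)

def summed_waveform(delay_ms, ratio=0.25, latency_ms=1.5, n=400):
    """First contribution plus a delayed second contribution scaled by `ratio`."""
    return [wave(i * DT_MS, latency_ms) + ratio * wave(i * DT_MS, latency_ms + delay_ms)
            for i in range(n)]

def count_troughs(waveform):
    """Number of strict local minima (negative peaks) in the sampled waveform."""
    return sum(1 for i in range(1, len(waveform) - 1)
               if waveform[i] < waveform[i - 1] and waveform[i] < waveform[i + 1])

for delay in (0.6, 0.4):
    print(delay, "ms latency difference:", count_troughs(summed_waveform(delay)), "trough(s)")
```

Under these assumptions, the 0.6 ms case keeps two separate negative peaks, whereas in the 0.4 ms case the second contribution only appears as a shoulder on the first.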

The URs of the various brainstem nuclei are crucial for how the potentials add up. As the URs depend on recording sites, the effect of changing electrode sites is demonstrated in Figure 6 showing click-evoked ABRs in a normal-hearing guinea pig, first with skin needles as electrodes, second with screws implanted in the skull as electrodes. For the different click levels, the waveforms show clear differences.

Figure 6.

Click-evoked ABRs recorded from normal-hearing guinea pigs. Clicks consisted of monophasic pulses of 20 μs with alternating polarity, presented at a rate of 10.1/s. The levels indicate dB attenuation relative to ~110 dB pe SPL. Subcutaneous needle electrode configuration: Active electrode behind the ipsilateral pinna, reference electrode on the skull, rostral to the brain, and ground electrode in the hind limb. Transcranial screw electrode configuration: active electrode 1 cm posterior to bregma, and the reference electrode 2 cm anterior to bregma; as ground electrode a subcutaneous needle electrode in the hind limb was used. For methodological details see [24].

2.4 Effect of hearing loss

ABR waveforms vary with degree and types of hearing loss. We discuss two different types of common pathologies with respect to the consequences for the click-evoked ABR, OHC loss in basal cochlear regions, and synaptopathy.

OHC loss in basal cochlear regions, caused for instance by ototoxic medication, noise trauma, aging, or any combination of these, leads to high-frequency hearing loss and to degradation of frequency tuning, which both have consequences for click-evoked responses of the auditory nerve. First, the latency increase with decreasing click level will be larger than normal, since at the lower levels the neurons from apical regions, which have late responses because of the traveling wave delay, will dominate the contributions to the ABR. Second, the difference in auditory nerve responses between rarefaction and condensation clicks (see Section 3.4 on stimulus polarity), which is negligible in normal ears, will increase, in particular with respect to latency. Basal neurons in regions of OHC loss show decreased sensitivity for high frequencies and increased sensitivity for low frequencies [26], which can be characterized as double frequency tuning, leading to click responses with short latencies typical for high-CF click responses and latency differences between rarefaction and condensation clicks reminiscent of low-CF responses [27]. While this polarity asymmetry occurs at high click levels, at low levels the dominating low-CF responses will cause a latency difference in responses between the rarefaction and condensation polarity. Third, shallow frequency tuning may lead to increased synchronization [27], which can be explained by considering the click response as an impulse response of which the frequency tuning is the Fourier transform.

In animals, it has been demonstrated that aging leads to loss of neurons because of damage to the IHC synapses while the IHC itself remains functional [7]. Exposure to noise, even when it does not lead to IHC loss, augments this cochlear synaptopathy. The amplitude of wave I of the ABR has been found to be strongly correlated with the survival of IHC synapses in mice [28], reminiscent of the correlation between eCAP amplitude and neural survival in Figure 4. In humans, neural degeneration also occurs with increasing age, and speech perception has been shown to be affected by the neural loss as quantified in a post-mortem histological analysis [29]. The low-SR neurons, which have high thresholds, are especially vulnerable to synaptopathy, and therefore the ratio of wave I amplitudes at high and low stimulus levels is regarded as a measure of synaptopathy. Carcagno and Plack [30] underscored the use of this ABR measure as they found a decrease in the wave I ratio with age. Alternatively, the ratio of wave I and wave V amplitudes is sometimes used.


3. Clinical ABR measurements

3.1 Recording ABRs

For clinical ABR measurements, an acoustical stimulus is presented to the patient and electrodes mounted on the skin of the head record the neural responses. Generally, short-duration stimuli are used, and the response is acquired in a time-window of about 10–20 ms starting at stimulus onset. High-quality recordings require good contact between skin and recording system. Therefore, electrodes should be applied to the skin carefully to minimize the electric impedance between electrode and skin. Many different types of electrodes are available, both disposable and non-disposable. The quality of the electrodes and their application is of utmost importance for a high ABR recording quality. It is essential that inter-electrode impedance is kept below 5 kΩ, preferably below 3 kΩ. If this cannot be achieved, then at least the inter-electrode impedances should be symmetric, for instance all around 8 kΩ, as will be explained below. Inter-electrode impedance should be kept stable during the ABR assessment, so well-fixated electrodes are required.

For single-channel ABR-recording, one electrode (the so-called active electrode) is attached to the skin, generally at the midline of the head somewhere between forehead and nape. The ABR amplitude is higher when its position is closer to the vertex. A second electrode (also called the reference electrode) is attached at ear level, for instance, close to the upper border of the mastoid plane. The position of the third (ground) electrode is not very critical. Often, an off-midline location on the forehead is chosen (see Figure 7), but for a single-channel recording, the ear-level position at the contralateral ear can also be used. In that case, when changing stimulation side, the reference and ground electrode should be exchanged.

Figure 7.

Measurement setup for a single channel ABR measurement with an electrode on the midline, an electrode at ear level, and a grounding electrode off mid-line on the forehead.

ABR potentials are extremely small in comparison to other (interfering) potentials picked up by the electrodes. Therefore, high recording quality requires knowledge about the possible origins of these interferences and methods to reduce their strength.

3.2 Amplifying, filtering, and averaging of the ABR signal

As ABR-potentials are in the range of 0.002–2 μV, amplification by a factor of 10,000–100,000 is required before the signals can be processed and interpreted. To achieve the high amplification factor that is required, differential amplifiers must be used. This type of amplifier has three connectors, two for input to the amplification channel (so-called plus and minus inputs) and one ground connector. Commonly, the midline (active) electrode is connected to the plus input, and the ear-level (reference) electrode is connected to the minus input. The third (ground) electrode is connected to the ground connector. In multi-channel ABR-recording systems, different channels share the active and the ground electrode. For each extra channel, only a separate reference electrode is needed. Often an ipsilateral ear-level electrode is used as a reference electrode for channel 1 and a contralateral ear-level electrode is used as a reference electrode for channel 2.

A differential amplifier suppresses the contribution of potential variations that are (approximately) common to the plus and minus input connectors, thereby reducing their contribution to the amplifier’s output signal. The common-mode rejection ratio is the amplifier characteristic that reflects to what extent this suppression is successful. It should be at least 90 dB for high-quality ABR measurements. The common-mode rejection ratio degrades significantly when electrode impedances are too asymmetric, for instance, 2 kΩ for the reference electrode against 10 kΩ for the active electrode. So, inter-electrode impedance symmetry is essential for reaching a common-mode rejection ratio as high as is specified for the amplifier that is used.
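
To give a feeling for what a common-mode rejection ratio of 90 dB means, the following Python sketch converts the ratio from dB to a linear factor, using the usual definition of the ratio as 20·log10 of differential gain over common-mode gain. The 1 mV common-mode potential is a hypothetical value chosen for illustration, and the function name is ours.

```python
def residual_common_mode_uv(common_mode_uv, cmrr_db):
    """Input-referred remainder (in µV) of a common-mode potential after rejection.

    Assumes CMRR in dB = 20 * log10(differential gain / common-mode gain).
    """
    return common_mode_uv / (10 ** (cmrr_db / 20))

# Hypothetical 1 mV (1000 µV) interference common to both inputs
for cmrr_db in (60, 90):
    print(cmrr_db, "dB:", round(residual_common_mode_uv(1000.0, cmrr_db), 4), "µV")
```

With these assumed numbers, the remainder at 90 dB is a few hundredths of a μV, whereas at 60 dB it is about 1 μV, which is of the same order as the ABR itself. This is one way to see why degradation of the ratio through asymmetric electrode impedances matters.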

Overloading the amplifier is unavoidable in ABR recording. Activation of head and neck muscles, for instance, may produce potential variations (EMG potentials) between the plus and minus connectors of 10–50 mV. To avoid overloading a first-stage amplifier with an amplification factor of, say, 1000, the amplifier’s output signal should be able to vary up to 10–50 V without saturating. Such a large output dynamic range requires a high power-supply voltage to avoid too many overloads. If the power supply of the amplifier cannot accommodate these high output levels, the output signal will saturate at its maximum or minimum extreme value and stay at that level for a time. Saturation generally occurs a little below the power-supply voltage; for instance, with a power-supply voltage of 15 V, the output saturates just below +15 V or −15 V.
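
The arithmetic of this overload example can be written out as a short Python sketch. It idealizes the amplifier: the output is assumed to clip exactly at the supply rails (in practice a little below them, as noted above), and the recovery behavior after an overload is not modeled. The function name and default values are illustrative.

```python
def first_stage_output_v(input_mv, gain=1000, supply_v=15.0):
    """Ideal first-stage output in V, clipped at +/- supply_v (simplifying assumption)."""
    ideal_v = input_mv * 1e-3 * gain
    clipped_v = max(-supply_v, min(supply_v, ideal_v))
    return ideal_v, clipped_v, ideal_v != clipped_v

# EMG potentials of 10 and 50 mV at the input, as in the text
for emg_mv in (10.0, 50.0):
    ideal, clipped, saturated = first_stage_output_v(emg_mv)
    print(emg_mv, "mV ->", ideal, "V ideal,", clipped, "V actual, saturated:", saturated)
```

In this idealization a 10 mV EMG potential still fits within a 15 V supply, while a 50 mV potential would require 50 V of output swing and therefore overloads the stage.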

One of the most important characteristics of the amplifier is its behavior when it recovers from overloads. This behavior is never listed in the specifications of the amplifier because the specifications only describe the normal functioning of the amplifier and not how it behaves after an overload. Some amplifiers show recovery behavior that makes them unfit for ABR recording, especially when the recovery potential waveform is a damped oscillation. We advise checking this behavior of the amplifier by using a single overloading pulse as the input signal.

Every ABR measurement system uses an analog bandpass filter in the input stage to suppress all non-ABR-related content of the input signal. Depending on the slopes of the passband, appropriate high-pass and low-pass cut-off frequencies should be selected. The steeper the slope in dB/oct, the lower the high-pass cut-off frequency should be. For a slope of 24 dB/oct, the high-pass cut-off frequency should be as low as 10–15 Hz. A quadrupling of that range is allowed for each halving of the slope. For example, for a 6 dB/oct filter slope, the high-pass cut-off frequency should be set at 160–240 Hz. The low-pass cut-off frequency is less critical, as long as it is above 2 kHz for a slope of 6 dB/oct, with a quadrupling per doubling of the slope. After the analog-to-digital conversion that occurs at some point in the signal processing, various filter designs can be used, provided that such filtering uses linear-phase filters. In addition to amplification and filtering, four other methods are used to suppress the interfering potentials as much as possible to improve the quality of the ABR recording: averaging, artifact rejection, windowing, and alternating the stimulus polarity.
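
The rule of thumb for the cut-off frequencies given above can be written as a small Python sketch. The function names are ours, and applying the rule to slopes other than 6, 12, and 24 dB/oct (by interpolating on a logarithmic scale) is an assumption that goes beyond the values stated in the text.

```python
import math

def high_pass_range_hz(slope_db_per_oct):
    """High-pass cut-off range: 10-15 Hz at 24 dB/oct, quadrupled per halving of slope."""
    halvings = math.log2(24 / slope_db_per_oct)
    factor = 4 ** halvings
    return 10 * factor, 15 * factor

def low_pass_minimum_hz(slope_db_per_oct):
    """Lower bound of low-pass cut-off: 2 kHz at 6 dB/oct, quadrupled per doubling of slope."""
    doublings = math.log2(slope_db_per_oct / 6)
    return 2000 * 4 ** doublings

for slope in (6, 12, 24):
    low, high = high_pass_range_hz(slope)
    print(slope, "dB/oct: high-pass", low, "to", high, "Hz; low-pass above",
          low_pass_minimum_hz(slope), "Hz")
```

For 12 dB/oct, which the text does not list explicitly, the rule gives a high-pass range of 40–60 Hz and a low-pass cut-off above 8 kHz.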

Most of the interfering potentials are not synchronous with stimulus onset but occur at random times relative to it. Consequently, at a specific time point after stimulus onset, the measured electric potential consists of the ABR amplitude (signal) at that time point plus the sum of randomly distributed interfering potential amplitudes (noise). The signal is very weak compared to the noise. The signal, however, is causally related to the stimulus, while the noise varies randomly in amplitude and sign. When the responses to many repeated fixed-level stimulus presentations are averaged, the noise amplitudes tend to cancel each other and their average tends toward zero. The average of the ABR component, however, is not zero, and its relative contribution increases with the number of stimulus presentations. After averaging 1000–2000 stimulus presentations, the ABR component is generally stronger than the noise component and the ABR waveform emerges from the noise. For higher ABR amplitudes, commonly at higher stimulation levels, fewer single-stimulus responses need to be averaged than at lower stimulation levels to arrive at the same ABR signal-to-noise ratio.
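The effect of averaging can be illustrated with a small simulation. The sketch below assumes stationary, independent Gaussian noise, for which the residual noise in the average is expected to fall roughly with the square root of the number of sweeps. The template shape, the 0.5 μV amplitude, and the 10 μV noise level are illustrative assumptions, not values taken from a real recording.

```python
import math
import random


def average_sweeps(n_sweeps, template, noise_sd, rng):
    """Average n_sweeps simulated sweeps of template plus Gaussian noise."""
    total = [0.0] * len(template)
    for _ in range(n_sweeps):
        for i, s in enumerate(template):
            total[i] += s + rng.gauss(0.0, noise_sd)
    return [t / n_sweeps for t in total]


def rms_error(average, template):
    """Root-mean-square difference between the average and the true signal."""
    squared = [(a - s) ** 2 for a, s in zip(average, template)]
    return math.sqrt(sum(squared) / len(squared))


rng = random.Random(0)  # fixed seed so the run is repeatable
template = [0.5 * math.sin(2 * math.pi * i / 50) for i in range(50)]  # "ABR", uV

err_20 = rms_error(average_sweeps(20, template, 10.0, rng), template)
err_2000 = rms_error(average_sweeps(2000, template, 10.0, rng), template)
```

In this model the residual noise after 2000 sweeps is expected to be about one tenth of that after 20 sweeps. Real interference is not stationary, so the gain in practice is smaller.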

Averaging the responses to multiple stimulus presentations increases the signal-to-noise ratio drastically. The signal-to-noise ratio can be improved even more by a non-linear filter process called Artifact Rejection (AR). This process imposes a lower and an upper limit on the electrode potential values that are accepted as valid measurements during a single registration at a fixed stimulus level. The idea is that if either limit is exceeded during that registration, the response is dominated by interference and does not reflect the auditory nerve and brainstem responses. The upper and lower limits are commonly set symmetrically as plus and minus a specific voltage value called the AR level. If any of the values in the sequence exceeds the AR level, the whole sequence is rejected for averaging. ABR systems in general allow the AR level to be set in μV, for instance ±15 μV. For good-quality ABR recording the AR level should be somewhere between 15 and 25 μV. Some ABR systems allow specification of the number of times that the AR level may be exceeded before the whole sequence is rejected. For good-quality ABR recordings this number should be low, close to zero. Other signal averaging systems do not use response amplitude as an AR criterion, but the AR rate. For instance, say that 1650 stimuli had to be presented to arrive at 1500 accepted responses for averaging. In other words, the responses to 1500 stimuli were accepted and the responses to 150 stimuli were rejected. In that case, the rejection count was 150 and the rejection rate was 150/1650, or about 9%. Setting an AR rate instead of absolute response amplitude levels may result in accepting averages that are dominated by a few contaminated responses with high potential amplitudes, for example of myogenic origin. In terms of statistics, this approach may lead to a higher type II error probability (i.e. the mistaken acceptance of a false null hypothesis). Therefore, we advise against the use of such averaging systems for clinical ABR assessment.
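A minimal sketch of amplitude-based artifact rejection and of the rejection-rate arithmetic follows. The function and parameter names are hypothetical, and the default AR level of 15 μV is simply the lower end of the range recommended above.

```python
def accept_sweep(sweep_uv, ar_level_uv=15.0, max_exceedances=0):
    """Accept a sweep unless it exceeds the AR level too often.

    sweep_uv: electrode potential samples of one sweep, in microvolts.
    max_exceedances: number of samples allowed beyond +/- ar_level_uv.
    """
    exceedances = sum(1 for v in sweep_uv if abs(v) > ar_level_uv)
    return exceedances <= max_exceedances


def rejection_rate(n_accepted, n_presented):
    """Fraction of presented stimuli whose responses were rejected."""
    return (n_presented - n_accepted) / n_presented


clean = accept_sweep([1.0, -3.0, 14.9])    # all samples within +/- 15 uV
contaminated = accept_sweep([1.0, 40.0])   # one sample far above the AR level
rate = rejection_rate(1500, 1650)          # 150 / 1650
```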

In unweighted averaging, every accepted response sequence contributes equally to the average after, say, 1500 stimulus presentations. In weighted averaging, however, each accepted response sequence is assigned a weight, calculated according to some rule. For example, the weight could be the reciprocal of the variance of the sequence. This results in a final average with a larger contribution from the sequences with less interference (i.e. lower variance). Manufacturers of ABR measurement systems generally do not specify the rule used in their system. Combining weighted averaging with AR is sometimes called Bayesian AR. This procedure uses weighted averaging for responses that are still within the specified AR limits, assigning less weight to responses with higher amplitude. Responses with amplitudes outside the AR limits are still rejected.
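As an illustration of the inverse-variance example mentioned above, the following sketch weights each sweep by one over its own variance. It does not represent the scheme of any particular commercial system, and it assumes that no sweep is perfectly flat (zero variance).

```python
from statistics import pvariance


def weighted_average(sweeps):
    """Average sweeps sample by sample, each weighted by 1 / variance."""
    weights = [1.0 / pvariance(s) for s in sweeps]
    total_weight = sum(weights)
    n_samples = len(sweeps[0])
    return [
        sum(w * s[i] for w, s in zip(weights, sweeps)) / total_weight
        for i in range(n_samples)
    ]


quiet = [0.0, 1.0, 0.0, -1.0]   # low-variance sweep
noisy = [5.0, -4.0, 6.0, -7.0]  # high-variance sweep, e.g. with muscle activity

weighted = weighted_average([quiet, noisy])
unweighted = [(q + n) / 2 for q, n in zip(quiet, noisy)]
```

In this toy example the weighted average stays close to the quiet sweep, while the unweighted average is dominated by the noisy one.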

3.3 Identifying interfering potentials

To get a grip on the ever-present interference, one needs to know the origins of the interfering potentials. The interfering noise can be synchronous with the stimulus or not. In the first case, averaging does not help to reduce the amplitude of the interfering components. Furthermore, the interfering components can be of physiologic origin or not.

Interfering potentials with a physiologic origin are potentials generated within the patient’s body, e.g. by muscles, the brain, or the eyes. Muscle activations are the most powerful source. Due to the differential type of amplification, only muscles of the head and neck cause significant interference. Their interference comes in two kinds. (1) In sync with the stimulus, caused by the (strong) auditory stimulation used for ABR recording. The muscles involved are the postauricular muscles (the muscles that can move the pinna) and, in the neck, the sternocleidomastoid muscle. (2) Not in sync with the stimulus, caused by muscle activation at the level of head and neck, with muscles of the neck and jaw as major sources. The brain is also a source of interfering potentials, albeit normally much weaker than myogenic potentials in the ABR frequency band. All brain activity not related to the auditory system causes interfering potentials. The eye is also a weak source of interference in the ABR frequency band.

Non-physiological interference can be introduced by the recording and stimulation system itself, by other (medical) devices coupled to the patient, and by radiation from external sources. The ABR system can introduce interference through (1) the auditory stimulator used for eliciting the ABR, the so-called stimulus artifact, or (2) errors in, or poor electrical design of, the system.

Generally, the stimulator contains an electrodynamic loudspeaker that generates an electromagnetic wave resulting from its coil movements. This waveform mirrors the electrical stimulus waveform (more specifically, that waveform convolved with the stimulator’s impulse response). If this coil is close to the electrodes or their leads, an artifactual potential variation is introduced by electromagnetic induction. This interfering potential is in sync with the stimulus and is therefore not reduced in strength by averaging.

The most frequent design errors lead to mains interference through ground loops that originate in the amplifier and result from poor design of its power supply. For instance, the power supplies of the stimulus amplifier and the physiologic amplifier should be completely independent. If not, the supply voltage of the physiologic amplifier can suffer a dip when a strong stimulus is presented. Due to the extremely high amplification factor of the physiologic amplifier, even a very small dip can cause a significant output signal variation, which may incorrectly be interpreted as input signal variation. Another example: in a multi-channel recording system, the power supplies of the amplifiers of different channels should be independent and mutually completely decoupled to keep the common-mode-rejection factors independent.

Coupling of the patient to other medical equipment, like a heart-lung monitor in the intensive care unit or operating theater, often causes ground loops that contaminate the output of the physiologic amplifiers with mains interference. The patient, the electrode wiring, and the pre-amplifier also act as antennas that pick up electromagnetic fields from the environment by induction. There is a multitude of possible sources, like radio broadcasting, wireless telephones, pagers, automatic doors, etc.

3.4 Reducing interference

Identification of the origins of the interfering signals requires inspection of the raw amplified electrode signal during the averaging process. This can be done by observing a free-running registration that is in sync with stimulus presentation.

When the difference in skin-electrode impedance is high for different electrodes (inter-electrode impedance), non-physiological interferences generate higher interfering potentials in the ABR measurement system. Therefore, keeping inter-electrode impedances below 5 kΩ and preferably below 3 kΩ, helps to reduce the interference induced by stimulus artifact and electromagnetic irradiation. If this interference is still too strong, it helps to lower the inter-electrode impedances even further down to under 1 kΩ.

The stimulus artifact has the waveform of the convolution of the electrical stimulus waveform and the stimulator’s impulse response. Any stimulus waveform starts with either a compression or a reduction of the air pressure in the ear canal: the air is first condensed or first rarefied. The stimulus polarity is named accordingly: condensation or rarefaction. When the electric stimulus polarity is alternated in the series of, say, 1500 stimuli used for one stimulation level, the stimulus-artifact waveforms cancel each other from one stimulus to the next, because they are in anti-phase. At higher stimulation levels, however, the impulse response of the transducer might be somewhat asymmetric with respect to polarity, so that subsequent stimulus artifacts no longer cancel exactly. As a result, a stimulus artifact will remain present in the averaged response. This will occur specifically at levels close to the output limits of the transducer and with damaged transducers (after a drop to the floor, for instance).
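The cancellation described here can be shown with a toy model in which each sweep is the sum of a neural response and a stimulus artifact that follows the stimulus polarity. The sketch makes two simplifying assumptions: the neural response is identical for both polarities, which is not strictly true for real ABRs, and transducer asymmetry is reduced to a single hypothetical scale factor on the condensation artifact.

```python
def alternating_average(response, artifact, n_pairs, asymmetry=0.0):
    """Average n_pairs condensation/rarefaction sweep pairs.

    asymmetry: fractional excess of the condensation artifact over the
    rarefaction artifact (0.0 means perfectly anti-phase artifacts).
    """
    total = [0.0] * len(response)
    for k in range(2 * n_pairs):
        if k % 2 == 0:
            scale = 1.0 + asymmetry   # condensation sweep
        else:
            scale = -1.0              # rarefaction sweep
        for i in range(len(response)):
            total[i] += response[i] + scale * artifact[i]
    return [t / (2 * n_pairs) for t in total]


response = [0.0, 0.1, 0.3, 0.1]    # illustrative neural response, uV
artifact = [5.0, -5.0, 0.0, 0.0]   # illustrative stimulus artifact, uV

symmetric = alternating_average(response, artifact, 750)
asymmetric = alternating_average(response, artifact, 750, asymmetry=0.1)
```

With symmetric artifacts the average equals the neural response. With a 10% asymmetry, 5% of the artifact amplitude survives averaging in this model.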

Increasing the number of averaged (accepted) responses increases the signal-to-noise ratio of the resulting ABR waveform. This only holds, however, for stationary noise. In clinical measurements, ABR interfering noise is in general very non-stationary in character. Therefore, averaging more than 2500 sweeps generally does not result in further improvement of the signal-to-noise ratio.

As myogenic potentials generate the strongest interference, the ABR recording quality can be greatly improved by reducing muscle tension in the patient. This can be done by several conservative methods. (1) Placing the patient in a relaxing position in a special chair or on a bed, with special attention to a relaxed head position. (2) Keeping the patient’s head in the midline. Asymmetric pre-tension of the two sternocleidomastoid muscles may lead to an asymmetric and stronger muscle artifact in sync with the stimulation. (3) Showing a (soundless and non-thrilling) video at a height that forces the patient to direct the eyes to the midline of the lower half of the visual field. When such measures do not suffice, additional medical measures can be taken, with medical authorization and under medical control. (1) Giving relaxing drugs to the patient. Some drugs, like ketamine, are unfit, however, because they provoke abnormal brain activity, resulting in higher interference in the ABR frequency band. (2) Giving full anesthesia with muscle relaxation and ventilation. In that case, however, care must be taken that the anesthesia is deep enough. Light anesthesia enhances the higher-frequency components of the EEG, resulting in enhanced interference in the ABR frequency band.

3.5 Recording strategy

To provide ABR-recordings with as much information as possible, the following procedures will help. (1) Make a two-channel recording at each stimulus level. (2) Create separate (sub)averages for different combinations of stimulus polarity, i.e. a (sub)average for condensation, rarefaction, and alternating polarity. (3) Create (sub)averages for test-retest measurements. (4) Record ABR responses at various levels of stimulation, spanning the (remaining) dynamic range of the auditory system for the side of stimulation, with five different levels if possible. (5) Present the different ABR recordings ordered vertically with the highest stimulus level on top. This creates an ABR pattern, that facilitates inspection of peak latency shift against stimulus intensity. If separate registrations for test-retest or condensation-rarefaction polarity are available, pairwise presentation per stimulus level is preferred. (6) Repeat steps 4–5 interactively during ABR assessment to arrive at the optimum result in the available time for assessment, “biding your time”. This way the next stimulation level to be measured can be chosen optimally.

In two-channel recordings, the active electrode of an amplification channel is commonly positioned at the midline of the head, e.g. the vertex, and the reference electrodes at ear-level. With the reference-electrode at the side of stimulation, the ipsilateral ABR is recorded. With the reference electrode at the ear opposite to the side of stimulation, the contra-lateral ABR is recorded. The ipsi- and contralaterally measured ABR waveforms differ in specific aspects that can help to identify the ABR waveform peaks I–V. The most important differences are (1) peak III has a somewhat shorter latency in the contra-laterally derived ABR-waveform; (2) peak V has a somewhat longer latency in the contra-laterally derived ABR-waveform. With the ipsilateral recording projected right above the contralateral one in the visual representation, a kind of trapezoidal shape is visible in the peaks III–V combination. This greatly helps identifying that combination, specifically if the peak I–II combination is difficult or impossible to identify, see Figure 8.

Figure 8.

Example of the simultaneous presentation of ipsi- and contralateral registration of ABR-response, showing the trapezoidal shape of the peak III–V complex.

The ABR waveforms for condensation and rarefaction stimulus polarities are not identical. This can only be made visible when responses for different stimulus polarities are recorded separately. A major problem with measuring the ABR responses for different stimulus polarities in different measurement runs is that due to the non-stationary nature of noise, these responses are measured under different interfering noise conditions. This can be avoided by measuring ABR responses with an alternating stimulus-polarity and storing the corresponding responses in separate data buffers. This allows creating subaverages for condensation and rarefaction stimuli that are acquired in similar noise conditions. When data of the different buffers are summed, the alternated average is still visible, but it can be split into two separate parts. Projecting the one superimposed on the other (with a different color for example) makes the differences between responses from condensation and rarefaction stimulus polarities visible. One obvious difference that stands out is the form of the stimulus-artifact, which is of opposite polarity. Concomitantly, if the cochlear microphonic response is detectable within the stimulus artifact, it will also show different polarity. Major differences can also occur in the morphology of III–V peak complex in cases of (steep) high-frequency cochlear hearing loss, as was explained in Section 2.4. These differences can be so large that identification of the III–V complex is ambiguous in responses obtained with alternating stimulus polarity, while identification is straightforward in the responses obtained with condensation or rarefaction polarity separately.

Additional information can also be derived by creating separate data buffers for alternating test and retest registration to again obtain subaverages acquired in similar noise conditions. Projecting test-retest subaverages on top of each other in different colors enables quick visual inspection of the stability of the acquired ABR responses, to determine if the ABR peaks robustly rise above the residual noise floor (see Figure 9). This can not only be judged subjectively, but the two subaverages also allow for quantitative calculations of various measures of similarity.

Figure 9.

An example of the presentation of test-retest subaverages. The upper registration shows a case with low test-retest reproducibility and the lower registration shows a case with high test-retest reproducibility.
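One possible quantitative measure of test-retest similarity is the Pearson correlation coefficient between the two subaverages, sketched below. The choice of this particular measure is an assumption made for illustration; other measures can be used in the same way. The function is undefined for a perfectly flat trace.

```python
import math


def pearson_r(test, retest):
    """Pearson correlation between two equally long subaverages."""
    mean_t = sum(test) / len(test)
    mean_r = sum(retest) / len(retest)
    cov = sum((t - mean_t) * (r - mean_r) for t, r in zip(test, retest))
    var_t = sum((t - mean_t) ** 2 for t in test)
    var_r = sum((r - mean_r) ** 2 for r in retest)
    return cov / math.sqrt(var_t * var_r)


reproducible = pearson_r([0.0, 1.0, 0.0, -1.0], [0.0, 2.0, 0.0, -2.0])
inverted = pearson_r([0.0, 1.0, 0.0, -1.0], [0.0, -1.0, 0.0, 1.0])
```

A value close to 1 indicates that the two subaverages share the same waveform shape. Values near zero or below suggest that the average is dominated by residual noise.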

In summary, sorting the single stimulus responses into four data buffers, subaveraging and making various combined or split views of the results, yields easily available information on the stability of the results and of the differences between condensation and rarefaction responses.
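The four-buffer bookkeeping can be sketched as follows. The tuple layout, the labels 'C' and 'R' for condensation and rarefaction, and the function names are hypothetical. The combined views assume equal numbers of sweeps in the buffers that are merged.

```python
def subaverages(sweeps):
    """Sort sweeps into (polarity, run) buffers and average each buffer.

    sweeps: iterable of (polarity, run, samples), with polarity 'C' or 'R'
    and run 'test' or 'retest'.
    """
    sums, counts = {}, {}
    for polarity, run, samples in sweeps:
        key = (polarity, run)
        if key not in sums:
            sums[key] = [0.0] * len(samples)
            counts[key] = 0
        for i, value in enumerate(samples):
            sums[key][i] += value
        counts[key] += 1
    return {key: [v / counts[key] for v in total] for key, total in sums.items()}


def combine(avgs, keys):
    """Unweighted mean of the selected subaverages."""
    length = len(avgs[keys[0]])
    return [sum(avgs[k][i] for k in keys) / len(keys) for i in range(length)]


sweeps = [
    ("C", "test", [1.0, 2.0]), ("R", "test", [-1.0, 2.0]),
    ("C", "retest", [1.0, 4.0]), ("R", "retest", [-1.0, 4.0]),
]
avgs = subaverages(sweeps)
condensation = combine(avgs, [("C", "test"), ("C", "retest")])
alternating = combine(avgs, list(avgs))  # all four buffers together
```

The split views (per polarity, or test against retest) and the overall alternating average are all derived from the same four buffers, so no additional recording time is needed.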

Preferably five or more responses for different stimulation levels are acquired to construct an ABR pattern. As measurement time is often precious, due to the requirement of the patient remaining in a relaxed condition, it is best to aim at first acquiring ABRs at higher stimulation levels and then at levels between 25 dB above and around the response threshold. In the lower-level range, the stimulus step size should be 10 or even 5 dB, while at levels far above threshold larger step sizes of 20 or even 25 dB can be used. Each succeeding stimulation level should be chosen as time-efficiently as possible. This can be achieved by constructing the ABR pattern each time acquisition at a specific stimulus level is completed. If a succeeding acquisition with a much lower stimulus level produces no ABR response, an educated guess should be made for the next higher level to be used.

The response threshold is defined as the lowest stimulation level at which reproducible response peaks (generally peak V) can be identified. At levels up to 20 dB above threshold and 5 dB below threshold, replication measurements are advised to confirm the presence or absence of response peaks. Note that for response threshold assessment, at least one acquisition must be obtained that shows no response peaks at all, preferably at a stimulation level just below (say 5 dB below) the lowest stimulation level that shows reproducible peaks in the response. Obviously, this is not necessary if response peaks are found at levels in the normal range of the response threshold, i.e. 0–20 dB(nHL). To enable good interpretation of the results for the various stimulation levels, it is very useful to order the recordings vertically according to decreasing stimulus levels, preferably in pairs of ipsi- and contralateral responses. This way of constructing the ABR pattern enables tracking of ABR peaks from high stimulus levels down to threshold, as is demonstrated in Figure 8.
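The threshold definition and the requirement of a no-response recording can be expressed as a small decision function. This is only a sketch: the function name and the input format are hypothetical, and the judgment whether a reproducible peak is present at each level is taken as given.

```python
def response_threshold(results, normal_limit_db_nhl=20):
    """Return (threshold, confirmed) or None if no response was found.

    results: dict mapping stimulation level in dB(nHL) to True when a
    reproducible response peak was identified at that level.
    confirmed is True when a lower level without response was recorded,
    or when the threshold lies within the normal range.
    """
    present = [level for level, found in results.items() if found]
    absent = [level for level, found in results.items() if not found]
    if not present:
        return None
    threshold = min(present)
    confirmed = (any(level < threshold for level in absent)
                 or threshold <= normal_limit_db_nhl)
    return threshold, confirmed


with_no_response = response_threshold({80: True, 60: True, 45: True, 40: False})
unconfirmed = response_threshold({80: True, 60: True})
normal_range = response_threshold({40: True, 20: True})
```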

3.6 Interpreting ABRs

The first step in ABR pattern interpretation is peak identification. The second step in the audiological use of the ABR pattern is determining its validity, i.e. whether the ABR pattern reflects the neural integrity of the auditory nerve and brainstem. The third step is determining the response threshold level. The fourth step is analyzing the relations between latency of the peaks and stimulus level.

First, identify peaks in the ABR pattern with equal or higher latency for decreasing stimulation level. At higher stimulation levels, say above 85 dB(nHL), peak latencies may be stable; at lower stimulation levels peak latencies increase with decreasing stimulation level. Commonly this increase is larger when the stimulation level approaches the response threshold. Identification of the peak III–V complex is commonly easiest, and even more so with 2-channel recording with the ipsilateral averaged response positioned above the contralateral averaged response, as shown in Figure 8. A trapezoidal shape should be observable in this complex, which positively identifies the III–V complex. Reproducibility of amplitude and latency of a peak at a constant stimulation level is required for reliable peak identification. As explained above, a test-retest view of the response pattern is very helpful. Easy switching between the views of the overall average and the test-retest subaverages helps to achieve greater reliability of peak identification.

Next, the peak I–II complex should be identified. Commonly the complex is better identifiable in the ipsilateral recording than in the contralateral recording. At high stimulation levels, over 85 dB(nHL), peak I prevails in amplitude and peak II is visible as a kind of shoulder on the downslope of peak I. At lower stimulation levels peak II tends to prevail in amplitude and peak I is visible as a kind of shoulder on the upslope of peak II. The transition range is between 55 and 65 dB(nHL) in normal hearing. Below 55 dB(nHL) peak I is rarely visible, but peak II can be. Mistakenly identifying peak II as peak I yields an abnormally short time interval between this presumed peak I and peak V. Identification of the I–II complex can be difficult or impossible in cases of significant conductive hearing loss. Then one must rely on the identification of the III–V complex for interpreting the ABR results.

Before performing audiometry based on the ABR pattern, the neural integrity of the auditory nerve, which is the source of the measured peak I and II potentials, should be assessed. This can be done by measuring the inter-peak interval, i.e. latency differences between peaks. For adults with normal auditory nerve function, the I–V latency difference should be below 4.3 ms. Larger differences are suspicious and the reliability of the audiometric interpretation of the pattern is questionable. These limits are age-dependent, and for patients below 2.6 years, this limit value is higher. For term-born neonates it is 5.4 ms and for preterm neonates it is still higher. However, it must be kept in mind that absolute latencies can be prolonged due to a conductive hearing loss. In that case, the effective stimulation level of the cochlea is lower than the stimulation level by the amount of conductive loss. For each type of ABR system the normative values of the absolute latencies may differ somewhat, depending on the design and the stimulator used. Therefore, the absolute latencies of peak III and V at a specific stimulation level should preferably be compared to their normal range for that type of equipment setup.
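The inter-peak check can be sketched as a comparison of the I–V latency difference with the limits quoted above. Only the two limits given in the text are included. The group labels and function name are hypothetical, and limits for other age groups would have to come from normative data for the equipment in use.

```python
# Upper limits of the I-V inter-peak interval quoted in the text, in ms.
I_V_LIMIT_MS = {"adult": 4.3, "term_neonate": 5.4}


def iv_interval_suspicious(latency_i_ms, latency_v_ms, group="adult"):
    """True when the I-V interval exceeds the limit for the given group."""
    interval_ms = latency_v_ms - latency_i_ms
    return interval_ms > I_V_LIMIT_MS[group]


normal_adult = iv_interval_suspicious(1.6, 5.6)                    # 4.0 ms
prolonged_adult = iv_interval_suspicious(1.6, 6.2)                 # 4.6 ms
term_neonate = iv_interval_suspicious(1.8, 6.9, "term_neonate")    # 5.1 ms
```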

With high-quality ABR registrations, identification of the response threshold level is easy. In the case of moderate quality, identification is still possible but requires more expertise and experience. Two independent experienced judges will generally disagree by no more than 5–10 dB. For audiometric interpretation, the correspondence between the ABR threshold and the pure-tone threshold depends on the type of stimulus used, i.e. tonal or broadband. For a broadband stimulus such as a click, the ABR threshold is strongly correlated with the pure-tone audiometric threshold at 3 kHz [31], as further discussed in Section 4; the pure-tone threshold in dB(HL) is 10 dB lower than the ABR threshold in dB(nHL). In cases of very steep cochlear high-frequency hearing loss, the difference becomes larger, because the pure-tone frequency of highest correlation with the ABR threshold shifts downward. One should be on guard for this pitfall if shallower shapes of the ABR peaks are observed. For tonal stimuli, the relation between ABR threshold and pure-tone threshold depends strongly on the stimulus waveform used for eliciting the ABR.


4. Frequency-specific ABR

This section gives a brief overview of frequency-specific ABR techniques that are now commonly used to establish hearing thresholds in audiological assessment following the newborn hearing screening and discusses why these techniques may be considered appropriate.

Traditionally, 100 μs click stimuli are used to evoke ABR responses (Figure 10a, top left). This stimulus has a number of advantages: (1) it generally results in well-formed and detailed responses, (2) it helps in detecting auditory neuropathy, and (3) it generates relatively large responses, so that responses can be obtained in a brief amount of time [32]. Various studies describe a good correlation between click-evoked ABR thresholds and behavioral thresholds in the 2–4 kHz range, e.g. [31, 32], with correlations as high as 0.94. However, other studies report issues with the use of click stimuli for threshold estimates and report a much poorer correlation [33, 34]. The click-evoked ABR may seriously over- or underestimate sensory hearing loss, depending on hearing loss configuration. Though click ABR thresholds correlate well with the 2–4 kHz region on a population level, this does not necessarily result in accurate threshold estimates for individual patients. Stapells & Oates attribute these issues to the broadband spectrum of clicks and conclude that the click-ABR threshold probably represents the “best” hearing in a wide frequency range [33].

Figure 10.

Waveforms of ABR stimuli as recorded with an Interacoustics Eclipse loopback test: (a) 100 μs click and 4, 2, 1, and 0.5 kHz Blackman-window tone-burst stimuli; (b) broadband LS-CE chirp and 4, 2, 1, and 0.5 kHz NB-CE chirps.

Over the years, several methods for obtaining frequency-specific ABR thresholds have been explored, for example, involving ipsilateral masking of frequency regions or derived response methods with filtered clicks. Hall gives a review of these various approaches [35]. The most common clinical approach for recording frequency-specific ABRs is more straightforward and involves brief tone stimuli, or tone-bursts. A tone-burst stimulus is a transient stimulus of typically 5 tone cycles within a Blackman window (Figure 10a), or with an envelope consisting of a 2-cycle rise time, a 1-cycle plateau, and a 2-cycle fall time [35]. This stimulus configuration gives an acceptable trade-off between the short stimulus onset needed to evoke an auditory response and the narrow bandwidth needed to obtain frequency specificity. Several studies describe high correlations (0.85–0.95) between pure-tone audiometry thresholds and tone-burst ABR thresholds in adults [36, 37] and in infants [34, 38], and the authors conclude that tone-burst ABR is a clinically feasible and accurate method of estimating the pure-tone audiogram when appropriate correction factors are applied.
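A 5-cycle Blackman-windowed tone-burst of the kind described above can be generated as in the following sketch. The sample rate of 48 kHz, the sine starting phase, and the function name are assumptions made for illustration. The window uses the standard Blackman coefficients 0.42, 0.5, and 0.08.

```python
import math


def blackman_tone_burst(freq_hz, n_cycles=5, sample_rate_hz=48000):
    """Return samples of a tone-burst of n_cycles shaped by a Blackman window."""
    n = int(round(n_cycles * sample_rate_hz / freq_hz))
    burst = []
    for k in range(n):
        phase = 2 * math.pi * k / (n - 1)
        window = 0.42 - 0.5 * math.cos(phase) + 0.08 * math.cos(2 * phase)
        tone = math.sin(2 * math.pi * freq_hz * k / sample_rate_hz)
        burst.append(window * tone)
    return burst


burst_1k = blackman_tone_burst(1000)        # 5 cycles at 1 kHz last 5 ms
peak_1k = max(abs(v) for v in burst_1k)
```

Because the number of cycles is fixed, the burst duration scales inversely with frequency: 10 ms at 0.5 kHz and 1.25 ms at 4 kHz.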

Larger and clearer ABR responses can be evoked by using chirp stimuli, mathematically designed to compensate for frequency-dependent traveling wave delays in the cochlea and to generate synchronous stimulation across a wide frequency region. These level-specific (LS) chirp stimuli generate larger amplitude responses than clicks or tone-bursts, thus increasing the signal-to-noise ratio and reducing test time [39]. From these broadband chirps, Elberling and Don derived narrow-band (LS NB-CE) chirps with approximately one-octave bandwidth (Figure 10b) [40]. These LS NB-CE chirps facilitate frequency-specific ABR.

Ferm et al. found significantly larger ABR responses with LS NB-CE Chirp stimuli compared to tone-bursts and anticipated a vast reduction in test time for achieving a similar SNR [41, 42]. They also established correction factors, compensating for the offset between ABR threshold (dB nHL) and estimated hearing level (dB eHL), as well as threshold confidence intervals for these stimuli. These correction factors are currently in use in the British Newborn Hearing Screening Program (Guidelines for the early audiological assessment and management of babies referred from the Newborn Hearing Screening Programme. British Society of Audiology, 2014).


5. Binaural auditory brainstem responses

An important feature of the auditory system is the ability to determine the location of sound sources relative to the head. Information from the two ears can be used to estimate the location of a sound source in the horizontal plane using interaural time differences (ITD) and interaural level differences (ILD). Using these binaural cues, normal-hearing individuals can localize sounds with high accuracy and precision [43]. Auditory localization allows humans to quickly detect and orient towards relevant sounds in the environment. This can be important, for example, when trying to safely navigate through traffic by bike or on foot, or when trying to focus on a single conversation in a noisy environment.

Measuring auditory localization accuracy and precision in a clinical setting requires specialized setups with a large number of speakers and, ideally, eye- or head-tracking. Although objective measures of hearing ability are often applied in the clinic, objective measures of auditory localization are not frequently used or well-known. An interesting objective measure of auditory localization can be found in the Binaural Interaction Component (BIC) of the ABR since the later peaks (IV and V) originate from binaural nuclei SOC and LL (see Section 2.2).

The amount of binaural interaction between the ears can be used as an objective measure of binaural hearing. ILDs and ITDs can be presented via headphones by introducing level and time differences between the left and right channels of a stereo sound. The BIC can be obtained by subtracting the ABR to a stereo sound from the sum of the monaural left and right ABRs [44, 45]. In normal-hearing listeners, the binaural ABR and the monaural sum are not the same, resulting in a different waveform: the BIC (see Figure 11). The most prominent peak in the BIC is the first negative peak, often called DN1 (sometimes called beta). The amplitude and latency of the DN1 systematically vary with ILD and ITD in humans and animals [46, 47, 48]. The largest amount of interaction (the largest DN1 amplitude) is typically observed at an ILD of 0 dB and/or an ITD of 0 μs.

Figure 11.

The BIC is calculated by subtracting the binaural ABR from the sum of the monaural ABRs. Figure obtained from Laumen et al. [45].

The DN1 amplitude, and thus the amount of binaural interaction, gradually decreases with increasing ILD or ITD. The most likely sources of the DN1 are the MSO and LSO in the SOC [45].
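The BIC calculation itself is a sample-by-sample subtraction, sketched below. Sign conventions differ between descriptions. This sketch assumes the binaural response minus the sum of the monaural responses, under which a binaural response smaller than the monaural sum appears as a negative deflection. Reversing the order of subtraction only flips the sign. Taking the most negative sample as DN1 is a crude stand-in for proper peak picking, and the waveforms are made-up numbers.

```python
def binaural_interaction_component(binaural, left, right):
    """Binaural ABR minus the sum of the two monaural ABRs (assumed sign)."""
    return [b - (l + r) for b, l, r in zip(binaural, left, right)]


def most_negative_peak(bic, sample_period_ms):
    """Amplitude and latency (ms) of the most negative BIC sample."""
    index = min(range(len(bic)), key=lambda i: bic[i])
    return bic[index], index * sample_period_ms


left = [0.0, 1.0, 2.0, 1.0]       # monaural left ABR, arbitrary units
right = [0.0, 1.0, 2.0, 1.0]      # monaural right ABR
binaural = [0.0, 2.0, 3.0, 2.0]   # binaural ABR, smaller than the sum at sample 2

bic = binaural_interaction_component(binaural, left, right)
dn1_amplitude, dn1_latency_ms = most_negative_peak(bic, 0.5)
```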

Given that the BIC is a difference waveform and that ABR peaks are typically of low amplitude, measuring the BIC requires a high signal-to-noise ratio in the binaural and monaural ABRs. Additionally, obtaining the DN1 amplitude for multiple ILDs requires quite extensive testing and may be less practical in the clinic, where less time may be available for measurements [48]. Some studies also report that the BIC is absent in some participants with normal localization skills (e.g. [49, 50]), making it difficult to rely on for individual diagnostic purposes in some cases. However, the BIC can be used to study the processing of binaural cues in the brainstem in various populations at a group level. For example, a study of children at risk for central auditory processing disorders (CAPD) showed that their BIC amplitude was reduced relative to normal-hearing children [51]. Interestingly, the children in the CAPD group showed normal ABR thresholds, suggesting that binaural interaction can be specifically affected in certain conditions. That the presence of the BIC has some diagnostic value can be seen in the results of a study in which it was used to detect children at risk for CAPD. The investigators could distinguish between children at risk for CAPD and those not at risk with a sensitivity and specificity of 76% [52].

To conclude, the BIC of the ABR provides important information regarding binaural processing in the brainstem. Although some studies suggest that it may not be the best objective measure for diagnosing binaural hearing disorders at an individual level, it does provide a unique window into binaural cue interactions early in the auditory processing pathway.


6. Conclusions

Sources of the ABR are the auditory nerve and brainstem auditory nuclei. Clinical application of ABRs includes identification of the site of lesion in retrocochlear hearing loss, establishing functional integrity of the auditory nerve, and objective audiometry. To help interpretation and establish reliability, separate subaverages may be obtained for ipsi- and contralateral registrations, and for test-retest reliability. Audiological assessment of infants who are referred after hearing screening relies on accurate estimation of hearing thresholds. Frequency-specific ABR using tone-burst or narrow-band chirp stimuli is a clinically feasible method for this. Whenever possible, obtained thresholds should be confirmed with behavioral testing. The binaural interaction component of the ABR provides important information regarding binaural processing in the brainstem. Although some studies suggest that it may not be the best objective measure for diagnosing binaural hearing disorders at an individual level, it does provide a unique window into binaural cue interactions early in the auditory processing pathway.


  1. Jewett DL, Williston JS. Auditory-evoked far fields averaged from the scalp of humans. Brain. 1971;94(4):681-696. DOI: 10.1093/brain/94.4.681
  2. Picton TW, Hillyard SA, Krausz HI, Galambos R. Human auditory evoked potentials. I. Evaluation of components. Electroencephalography and Clinical Neurophysiology. 1974;36(2):179-190. DOI: 10.1016/0013-4694(74)90155-2
  3. Møller AR, Jannetta PJ. Comparison between intracranially recorded potentials from the human auditory nerve and scalp recorded auditory brainstem responses (ABR). Scandinavian Audiology. 1982;11(1):33-40. DOI: 10.3109/01050398209076197
  4. Hashimoto I. Auditory evoked potentials recorded directly from the human VIIIth nerve and brain stem: Origins of their fast and slow components. Electroencephalography and Clinical Neurophysiology. Supplement. 1982;36:305-314
  5. Yoshinaga-Itano C, Manchaiah V, Hunnicutt C. Outcomes of universal newborn screening programs: Systematic review. Journal of Clinical Medicine. 2021;10(13):2784. DOI: 10.3390/jcm10132784
  6. Palmer AR. Anatomy and physiology of the auditory brainstem. In: Burkard RF, Don M, Eggermont JJ, editors. Auditory Evoked Potentials: Basic Principles and Clinical Application. Baltimore: Lippincott Williams & Wilkins; 2007. pp. 200-228
  7. Kujawa SG, Liberman MC. Synaptopathy in the noise-exposed and aging cochlea: Primary neural degeneration in acquired sensorineural hearing loss. Hearing Research. 2015;330:191-199. DOI: 10.1016/j.heares.2015.02.009
  8. Rhode WS, Smith PH. Encoding timing and intensity in the ventral cochlear nucleus of the cat. Journal of Neurophysiology. 1986;56:261-286. DOI: 10.1152/jn.1986.56.2.261
  9. Yin TCT. Neural mechanisms for encoding binaural localization cues in the auditory brainstem. In: Oertel D, Fay RR, Popper AN, editors. Integrative Functions in the Mammalian Auditory Pathway. New York, Berlin, Heidelberg: Springer; 2002. pp. 99-159
  10. Versnel H, Zwiers MP, van Opstal AJ. Spectrotemporal response properties of inferior colliculus neurons in alert monkey. The Journal of Neuroscience. 2009;29:9725-9739. DOI: 10.1523/JNEUROSCI.5459-08.2009
  11. Zwiers MP, Versnel H, Van Opstal AJ. Involvement of monkey inferior colliculus in spatial hearing. The Journal of Neuroscience. 2004;24:4145-4156. DOI: 10.1523/JNEUROSCI.0199-04.2004
  12. Boettcher FA. Presbyacusis and the auditory brainstem response. Journal of Speech, Language, and Hearing Research. 2002;45:1249-1261. DOI: 10.1044/1092-4388(2002/100). Epub 2003/01/28
  13. Land R, Burghard A, Kral A. The contribution of inferior colliculus activity to the auditory brainstem response (ABR) in mice. Hearing Research. 2016;341:109-118. DOI: 10.1016/j.heares.2016.08.008
  14. Melcher JR, Kiang NY. Generators of the brainstem auditory evoked potential in cat. III: Identified cell populations. Hearing Research. 1996;93:52-71. DOI: 10.1016/0378-5955(95)00200-6
  15. Versnel H, Prijs VF, Schoonhoven R. Single-fibre responses to clicks in relationship to the compound action potential in the guinea pig. Hearing Research. 1990;46:147-160. DOI: 10.1016/0378-5955(90)90145-f
  16. Prijs VF, Keijzer J, Versnel H, Schoonhoven R. Recovery characteristics of auditory nerve fibres in the normal and noise-damaged guinea pig cochlea. Hearing Research. 1993;71:190-201. DOI: 10.1016/0378-5955(93)90034-x
  17. Versnel H, Prijs VF, Schoonhoven R. Round-window recorded potential of single-fibre discharge (unit response) in normal and noise-damaged cochleas. Hearing Research. 1992;59:157-170. DOI: 10.1016/0378-5955(92)90112-z
  18. Versnel H, Schoonhoven R, Prijs VF. Single-fibre and whole-nerve responses to clicks as a function of sound intensity in the guinea pig. Hearing Research. 1992;59:138-156. DOI: 10.1016/0378-5955(92)90111-y
  19. Kiang NYS, Moxon EC, Kahn AR. The relationship of gross potentials recorded from the cochlea to single unit activity in the auditory nerve. In: Ruben RJ, Elberling C, Salomon G, editors. Electrocochleography. Baltimore: University Park Press; 1976. pp. 95-115
  20. Prijs VF. Single-unit response at the round window of the guinea pig. Hearing Research. 1986;21:127-133. DOI: 10.1016/0378-5955(86)90034-1
  21. Goldstein MH, Kiang NY-S. Synchrony of neural activity in electric responses evoked by transient acoustic stimuli. The Journal of the Acoustical Society of America. 1958;30:107-114
  22. Strahl SB, Ramekers D, Nagelkerke MMB, Schwarz KE, Spitzer P, Klis SFL, et al. Assessing the firing properties of the electrically stimulated auditory nerve using a convolution model. Advances in Experimental Medicine and Biology. 2016;894:143-153. DOI: 10.1007/978-3-319-25474-6_16
  23. Dong Y, Briaire JJ, Biesheuvel JD, Stronks HC, Frijns JHM. Unravelling the temporal properties of human eCAPs through an iterative deconvolution model. Hearing Research. 2020;395:108037. DOI: 10.1016/j.heares.2020.108037. Epub 2020/08/23
  24. Ramekers D, Versnel H, Strahl SB, Smeets EM, Klis SFL, Grolman W. Auditory-nerve responses to varied inter-phase gap and phase duration of the electric pulse stimulus as predictors for neuronal degeneration. Journal of the Association for Research in Otolaryngology. 2014;15:187-202. DOI: 10.1007/s10162-013-0440-x
  25. Kroon S, Ramekers D, Smeets EM, Hendriksen FG, Klis SF, Versnel H. Degeneration of auditory nerve fibers in guinea pigs with severe sensorineural hearing loss. Hearing Research. 2017;345:79-87. DOI: 10.1016/j.heares.2017.01.005
  26. Liberman MC, Dodds LW. Single-neuron labeling and chronic cochlear pathology. III. Stereocilia damage and alterations of threshold tuning curves. Hearing Research. 1984;16:55-74. DOI: 10.1016/0378-5955(84)90025-x
  27. Versnel H, Prijs VF, Schoonhoven R. Auditory-nerve fiber responses to clicks in guinea pigs with a damaged cochlea. The Journal of the Acoustical Society of America. 1997;101:993-1009. DOI: 10.1121/1.418057
  28. Sergeyenko Y, Lall K, Liberman MC, Kujawa SG. Age-related cochlear synaptopathy: An early-onset contributor to auditory functional decline. The Journal of Neuroscience. 2013;33:13686-13694. DOI: 10.1523/JNEUROSCI.1783-13.2013
  29. Wu PZ, O'Malley JT, de Gruttola V, Liberman MC. Primary neural degeneration in noise-exposed human cochleas: Correlations with outer hair cell loss and word-discrimination scores. The Journal of Neuroscience. 2021;41:4439-4447. DOI: 10.1523/JNEUROSCI.3238-20.2021
  30. Carcagno S, Plack CJ. Effects of age on electrophysiological measures of cochlear synaptopathy in humans. Hearing Research. 2020;396:108068. DOI: 10.1016/j.heares.2020.108068
  31. Van der Drift JF, Brocaar MP, Van Zanten GA. The relation between the pure-tone audiogram and the click auditory brainstem response threshold in cochlear hearing loss. Audiology. 1987;26(1):1-10
  32. Gorga MP, Johnson TA, Kaminski JK, Beauchaine KL, Garner CA, Neely ST. Using a combination of click- and tone burst-evoked auditory brain stem response measurements to estimate pure-tone thresholds. Ear & Hearing. 2006;27(1):60-74. DOI: 10.1097/01.aud.0000194511.14740.9c
  33. Stapells DR, Oates P. Estimation of the pure-tone audiogram by the auditory brainstem response: A review. Audiology & Neurotology. 1997;2(5):257-280. DOI: 10.1159/000259252
  34. Stevens J, Boul A, Lear S, Parker G, Ashall-Kelly K, Gratton D. Predictive value of hearing assessment by the auditory brainstem response following universal newborn hearing screening. International Journal of Audiology. 2013;52(7):500-506. DOI: 10.3109/14992027.2013.776180
  35. Hall JW. New Handbook of Auditory Evoked Responses. Boston: Pearson; 2007
  36. Canale A, Dagna F, Lacilla M, Piumetto E, Albera R. Relationship between pure tone audiometry and tone burst auditory brainstem response at low frequencies gated with Blackman window. European Archives of Oto-Rhino-Laryngology. 2012;269(3):781-785. DOI: 10.1007/s00405-011-1723-7
  37. Purdy SC, Abbas PJ. ABR thresholds to tonebursts gated with Blackman and linear windows in adults with high-frequency sensorineural hearing loss. Ear and Hearing. 2002;23(4):358-368
  38. Vander Werff KR, Prieve BA, Georgantas LM. Infant air and bone conduction tone burst auditory brain stem responses for classification of hearing loss and the relationship to behavioral thresholds. Ear & Hearing. 2009;30(3):350-368. DOI: 10.1097/AUD.0b013e31819f3145
  39. Elberling C, Don M. Auditory brainstem responses to a chirp stimulus designed from derived-band latencies in normal-hearing subjects. The Journal of the Acoustical Society of America. 2008;124(5):3022-3037. DOI: 10.1121/1.2990709
  40. Elberling C, Don M. A direct approach for the design of chirp stimuli used for the recording of auditory brainstem responses. The Journal of the Acoustical Society of America. 2010;128(5):2955-2964. DOI: 10.1121/1.3489111
  41. Ferm I, Lightfoot G, Stevens J. Comparison of ABR response amplitude, test time, and estimation of hearing threshold using frequency specific chirp and tone pip stimuli in newborns. International Journal of Audiology. 2013;52(6):419-423. DOI: 10.3109/14992027.2013.769280
  42. Ferm I, Lightfoot G. Further comparisons of ABR response amplitudes, test time, and estimation of hearing threshold using frequency-specific chirp and tone pip stimuli in newborns: Findings at 0.5 and 2 kHz. International Journal of Audiology. 2015;54(10):745-750. DOI: 10.3109/14992027.2015.1058978
  43. Middlebrooks JC, Green DM. Sound localization by human listeners. Annual Review of Psychology. 1991;42:135-159. DOI: 10.1146/
  44. Dobie RA, Berlin CI. Binaural interaction in brainstem-evoked responses. Archives of Otolaryngology. 1979;105:391-398. DOI: 10.1001/archotol.1979.00790190017004
  45. Laumen G, Ferber AT, Klump GM, Tollin DJ. The physiological basis and clinical use of the binaural interaction component of the auditory brainstem response. Ear and Hearing. 2016;37:e276-e290. DOI: 10.1097/AUD.0000000000000301
  46. McPherson DL, Starr A. Auditory time-intensity cues in the binaural interaction component of the auditory evoked potentials. Hearing Research. 1995;89:162-171. DOI: 10.1016/0378-5955(95)00134-1
  47. McPherson DL, Starr A. Binaural interaction in auditory evoked potentials: Brainstem, middle- and long-latency components. Hearing Research. 1993;66:91-98. DOI: 10.1016/0378-5955(93)90263-z
  48. Riedel H, Kollmeier B. Auditory brain stem responses evoked by lateralized clicks: Is lateralization extracted in the human brain stem? Hearing Research. 2002;163:12-26. DOI: 10.1016/s0378-5955(01)00362-8
  49. Sammeth CA, Greene NT, Brown AD, Tollin DJ. Normative study of the binaural interaction component of the human auditory brainstem response as a function of interaural time differences. Ear and Hearing. 2020;42:629-643. DOI: 10.1097/AUD.0000000000000964
  50. Furst M, Bresloff I, Levine RA, Merlob PL, Attias JJ. Interaural time coincidence detectors are present at birth: Evidence from binaural interaction. Hearing Research. 2004;187:63-72. DOI: 10.1016/s0378-5955(03)00331-9
  51. Gopal KV, Pierel K. Binaural interaction component in children at risk for central auditory processing disorders. Scandinavian Audiology. 1999;28:77-84. DOI: 10.1080/010503999424798
  52. Delb W, Strauss DJ, Hohenberg G, Plinkert PK. The binaural interaction component (BIC) in children with central auditory processing disorders (CAPD). International Journal of Audiology. 2003;42:401-412. DOI: 10.3109/14992020309080049
