List of model parameters
In many psychoacoustical tasks, hearing-impaired subjects display abnormal audiograms and poor understanding of speech compared to normal listeners. Existing models that explain the performance of the hearing impaired indicate that possible sources for cochlear hearing loss may be the dysfunction of the outer and inner hair cells. In this study, a model of the auditory system is introduced. It includes two stages: (1) a nonlinear time domain cochlear model with active outer hair cells that are driven by the tectorial membrane motion and (2) a synaptic model that generates the auditory nerve instantaneous rate as a response to the basilar membrane motion and is affected by the inner hair cell transduction efficiency. The model can fit both a normal auditory system and an abnormal auditory system with easily induced pathologies.
- Cochlear model
- outer hair cell
- hearing impairment
- auditory nerve
When sound waves enter the ear, they cause the basilar membrane (BM), which is located in the inner ear, to vibrate. Since each place on the BM is tuned to a specific characteristic frequency (CF), the BM is able to separate the frequency components of sounds. The BM vibrations excite both the outer hair cells (OHC) and the inner hair cells (IHC). The OHCs act as local amplifiers, while the IHCs transduce the sound-induced vibrations into electrical impulses. These impulses propagate toward the auditory cortex along the fiber tracts of the auditory pathway, and the neural information is processed on the way in a set of nuclei located in the auditory brainstem.
Damage can occur to the auditory system at any point along the auditory pathway. One of the most common impairments is OHC loss, frequently due to noise exposure. OHC loss is often followed by IHC loss. Various diseases or old age can also injure different neurons along the auditory pathway.
Hearing impairment is characterized by abnormal audiograms and poor understanding of speech. The most frequent complaint is the inability to understand speech in a noisy environment. In many psychoacoustical tasks, hearing-impaired subjects yield higher thresholds, that is, poorer performance, than normal listeners (review by Moore). For example, in monaural experiments, hearing-impaired subjects perform poorly in frequency discrimination tasks and in signal detection with a noisy background.
Models explaining the performance of hearing-impaired people [e.g., 2–9] indicate that the possible sources for cochlear hearing loss are the dysfunction of the outer hair cells and the loss of inner hair cells. The dysfunction of the OHCs reduces the gain of the active mechanism, which then tends to broaden the tuning curve and decrease the nonlinear effects. However, these models do not adequately predict hearing impairment performance [10, 11].
The purpose of this chapter is to introduce a comprehensive, nonlinear time domain cochlear model [6, 12–14], followed by a model of the auditory nerve (AN) response [7, 13, 16, 17]. Together they can be used to predict the hearing abilities of people with a normal cochlea as well as of people with an abnormal cochlea that suffers from OHC loss, IHC loss, or both.
Quantitative psychoacoustical measures that determine the human ability to detect the smallest difference in the physical property of a stimulus are usually implemented by forced-choice experiments. This difference is referred to as a “just-noticeable difference” (JND). Siebert showed that if one assumes that the brain is behaving as an optimal processor, then psychoacoustical JND measurements can be predicted from auditory nerve instantaneous rates. In this chapter, we use this approach to compare the model predictions to human hearing thresholds, both normal and impaired, in both a quiet environment and in the presence of background noise.
2. The human ear model
The mammalian ear is composed of the outer ear, the middle ear, and the inner ear. The outer ear includes the pinna, the ear canal, and the eardrum. The middle ear is an air-filled cavity behind the eardrum, which includes three small ear bones, the ossicles. The inner ear includes a snail-shaped structure, the cochlea (see schematic description in Figure 1A). The sound is directed by the outer ear through the ear canal to the eardrum. When sound strikes the eardrum, the movement is transferred through the three bones of the middle ear to a flexible tissue called the oval window, finally reaching the upper fluid-filled duct of the cochlea (see Figure 1). The upper cochlear duct is called scala vestibuli, and the bottom duct is referred to as scala tympani. The space between the top and bottom ducts is labeled as scala media.
The middle ear’s task is to match the impedance of the sound pressure in the air to that of the fluid. Movement of the fluid inside the upper cochlear duct results in a pressure difference between the upper and lower ducts. This pressure difference in turn causes the basilar membrane (the membrane that separates the scala tympani and scala media) to move.
Two types of auditory receptor cells inhabit the scala media, the inner and outer hair cells. The defining feature of those cells is the hair bundle on top of each cell. The hair bundle comprises dozens to hundreds of stereocilia, which are cylindrical actin-filled rods. The stereocilia are immersed in endolymph, a fluid that is rich in potassium and characterized by an endocochlear potential of +80 mV. The stereocilia move with the basilar membrane displacement. Their deflection opens mechanically gated ion channels that allow any small, positively charged ions (primarily potassium and calcium) to enter the cell. The influx of positive ions from the endolymph in the scala media depolarizes the cell, resulting in a receptor potential. The roles of the OHCs and IHCs in the function of the cochlea are very different. While the OHCs act as local amplifiers, the IHCs drive the auditory nerve fibers that innervate them. The OHCs lie on the basilar membrane, and their upper part is embedded in a gel-like membrane, the tectorial membrane (TM). An increase in the OHC receptor potential causes a decrease in its length, which in turn enhances the BM movement. The hair bundles of the IHC move freely in the scala media. The change in their receptor potential opens voltage-gated calcium channels that release neurotransmitters at the basal end of the cell, which trigger action potentials in the attached nerve.
Modeling the human ear requires a detailed model of the cochlea and the middle and outer ears. A common approach is to model the inner ear as a one-dimensional structure [e.g., 6, 14, 20–23] with the cochlea regarded as an uncoiled structure with two fluid-filled compartments with rigid walls that are separated by an elastic partition, the basilar membrane. The cochlear partition, whose mechanical properties are describable in terms of point-wise mass density, stiffness, and damping, is regarded as a flexible boundary between scala tympani and scala vestibuli. Thus, at every point $x$ along the cochlear duct, the pressure difference $P$ across the partition drives the partition’s velocity. By applying fundamental physical principles, such as the conservation of mass and the dynamics of deformable bodies, the differential equation for $P$ is obtained [e.g., 6]:

$$\frac{\partial^2 P}{\partial x^2} = \frac{2\rho\beta}{A}\,\frac{\partial^2 \xi_{bm}}{\partial t^2} \qquad (1)$$

where $\xi_{bm}$ is the BM displacement, $A$ represents the cross-sectional area of scala tympani and scala vestibuli, $\beta$ is the BM width, and $\rho$ is the density of the fluid in both the scala vestibuli and the scala tympani. The pressure on the BM ($P_{bm}$) is a result of both the difference in fluid pressure and the pressure caused by the OHCs ($P_{ohc}$). The relation between the pressures of BM, TM, and OHC is shown in Figure 1, which can be interpreted as

$$P_{bm} = P + P_{ohc} \qquad (2)$$
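The mechanical properties of both BM and TM are simulated as second-order oscillators that yield, for the BM displacement $\xi_{bm}$ and the TM displacement $\xi_{tm}$,

$$P_{bm} = M_{bm}\,\ddot{\xi}_{bm} + R_{bm}\,\dot{\xi}_{bm} + K_{bm}\,\xi_{bm}, \qquad P_{tm} = M_{tm}\,\ddot{\xi}_{tm} + R_{tm}\,\dot{\xi}_{tm} + K_{tm}\,\xi_{tm}$$

where $K_{bm}$, $R_{bm}$, $M_{bm}$, $K_{tm}$, $R_{tm}$, and $M_{tm}$ are the effective stiffness, damping, and mass per unit area of BM and TM, respectively (see Table 1), and $P_{bm}$ and $P_{tm}$ are the pressures acting on the two membranes.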
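The behavior of a single second-order oscillator can be illustrated with a short numerical sketch. It integrates one isolated membrane section, $m\ddot{\xi} + r\dot{\xi} + k\xi = p(t)$, driven by a pure tone. The coupling to the cochlear fluid and to the OHCs is deliberately left out, and every name and numerical value below is a hypothetical choice for illustration rather than a Table 1 parameter.

```python
import math

def simulate_oscillator(mass, damping, stiffness, pressure, dt, n_steps):
    """Integrate m*x'' + r*x' + k*x = p(t) with semi-implicit Euler steps."""
    x, v = 0.0, 0.0
    trace = []
    for i in range(n_steps):
        t = i * dt
        a = (pressure(t) - damping * v - stiffness * x) / mass
        v += a * dt          # update velocity first ...
        x += v * dt          # ... then displacement with the new velocity
        trace.append(x)
    return trace

# Hypothetical per-unit-area values giving a resonance near 1 kHz.
mass = 1.0e-3
resonance_hz = 1000.0
stiffness = mass * (2.0 * math.pi * resonance_hz) ** 2
damping = 2.0 * 0.1 * math.sqrt(stiffness * mass)   # damping ratio of 0.1

def tone(freq_hz):
    """Unit-amplitude sinusoidal driving pressure."""
    return lambda t: math.sin(2.0 * math.pi * freq_hz * t)

dt = 1.0e-6
n_steps = 50000   # 50 ms of simulated time

def steady_amplitude(freq_hz):
    """Peak displacement over the second half of the run (transient decayed)."""
    trace = simulate_oscillator(mass, damping, stiffness, tone(freq_hz), dt, n_steps)
    return max(abs(x) for x in trace[n_steps // 2:])

amp_on = steady_amplitude(1000.0)    # tone at the resonance frequency
amp_off = steady_amplitude(4000.0)   # tone two octaves above resonance
print(amp_on / amp_off)
```

The section responds far more strongly to the tone at its own resonance than to the off-resonance tone, which is the place-frequency tuning that the full model obtains by letting stiffness, damping, and mass vary along the partition.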
Since the OHCs lie between the two membranes, their displacement is considered as the difference between the TM and BM displacements:

$$\xi_{ohc} = \xi_{tm} - \xi_{bm}$$
Each OHC is modeled by two sections, the apical and basal parts. The apical part is directed toward the endolymph of the gap between the TM and the reticular lamina (RL), while the basolateral part is embedded in the perilymph next to the supporting cells that are aligned along the BM. When the OHCs’ stereocilia move due to the relative displacement of the BM and the TM, the conductance of the apical part of the OHC is affected, which in turn causes a flow of potassium and calcium ions from the endolymph into the cell. Thus, a voltage drop is developed on the basal part of the OHC membrane.
An outer hair cell model is described by an equivalent electrical circuit in Figure 2 [6, 25]. The apical part is presented by its variable conductance () and its constant capacitance (), while the basal part is presented by its constant conductance and capacitance, and , respectively. The electrical potential of the endolymph is , and the perilymph resting potential is . Solving the equivalent electrical circuit by using Kirchhoff laws  yields the differential equation for , the OHC’s membrane voltage:
where , which represents the cutoff frequency of the OHC’s membrane and (see Table 1).
An OHC’s length changes due to the electrical potential developed on the OHC membrane and is defined as . It is usually described as a sigmoid function [26–28]:
where and are constants (see Table 1).
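As an illustration of such a sigmoid, the sketch below uses a hyperbolic tangent, which is one common choice and is assumed here rather than taken from the chapter. The function name and both default constants are hypothetical. The negative sign encodes the statement that an increase in the OHC receptor potential shortens the cell.

```python
import math

def ohc_length_change(voltage, peak_to_peak=1.0e-6, reference_voltage=0.005):
    """Saturating electromotile length change for a membrane voltage [V].

    peak_to_peak      -- hypothetical full excursion of the length change
    reference_voltage -- hypothetical voltage scale of the sigmoid [V]
    """
    return -0.5 * peak_to_peak * math.tanh(voltage / reference_voltage)

# Small voltages give an almost linear response; large ones saturate.
for v in (-0.05, -0.005, 0.0, 0.005, 0.05):
    print(v, ohc_length_change(v))
```

For small voltages the response is nearly proportional to the voltage, and for large voltages it saturates at half the peak-to-peak excursion in either direction. This saturation is what makes the OHC contribution level dependent.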
The pressure developed by each OHC () is obtained from the spring properties of the OHC . Let’s define as the OHC effective index. It represents the effective distribution of the OHCs along the cochlear partition. Therefore, the OHC pressure is obtained by
where is the cochlear length, is the oval window displacement, and is the coupling factor of the oval window to the perilymph. In order to obtain , the middle ear model was applied  as expressed by the following differential equation:
where is the oval window areal density, is the oval window resistance, and is the oval window resonance frequency. The mechanical gain of the ossicles is denoted by (see Table 1). is the input acoustic stimulus.
The initial conditions are
| Value | Description [units] |
|---|---|
| 0.5 | Cross-sectional area of the cochlear scalae |
| 0.003 | Width of the basilar membrane [cm] |
| 3.5 | Cochlear length [cm] |
| | Basilar membrane stiffness per unit area |
| | Basilar membrane damping per unit area |
| | Basilar membrane mass per unit area |
| | Tectorial membrane stiffness per unit area |
| | Tectorial membrane damping per unit area |
| 0 | Tectorial membrane mass per unit area |
| | Outer hair cell membrane’s stiffness |
| | Peak to peak electromotility displacement |
| | Reference electromotility voltage [V] |
| | Outer hair cell cutoff frequency |
| | Perilymph resting potential [V] |
| | Oval window cutoff frequency [Hz] |
| 0.5 | Oval window areal density |
| | Oval window resistance |
| | Coupling of oval window to perilymph [none] |
| 21.4 | Mechanical gain of ossicles [none] |
| 1 | IHC AC coupling [V/s/cm] |
| 100 | IHC DC coupling [V/cm] |
| | IHC integration time [s] |
| 1 | AN coupling [spikes/s/V] |
| 60 | High spontaneous rate [spikes/s] |
| 3 | Medium spontaneous rate [spikes/s] |
| 0.1 | Low spontaneous rate [spikes/s] |
| 500 | Saturation rate [spikes/s] |
| 70 | Effective level threshold for high spontaneous rate [dB] |
| 50 | Effective level threshold for medium spontaneous rate [dB] |
| 30 | Effective level threshold for low spontaneous rate [dB] |
2.1. Simulation results: The effect of outer hair cells loss
The above cochlear model was solved in the time domain by implementing a parallel algorithm on a commodity graphics processor unit (GPU). The output of the model is the BM velocity as a response to an acoustic stimulus.
Figure 3 represents the basilar membrane velocity relative to the input level at two points along the cochlear partition. The response was obtained by applying the model for a set of simple tones with a frequency of at different levels . The gain plotted in Figure 3 was derived by , where from the stapes (Figure 3A) and from the stapes (Figure 3B). Each solid line was obtained from a different level for a normal cochlea. The broken line represents an abnormal cochlea with 100% OHC loss, which was derived by the model by setting the OHC effective index to zero. For the normal cochlea, the maximum sensitivity at from the stapes (Figure 3A) was obtained when the stimulus was at 4 kHz and 0 dB SPL. The sensitivity is reduced with the increase in the input level, and the maximum sensitivity was shifted to a lower frequency (about 1 kHz). These results are in agreement with experimental results. Figure 3B represents a characteristic frequency of 1 kHz that yielded wider responses as a function of frequency for all input levels. However, the gain of the damaged cochlea (broken line in Figure 3) was independent of the input level at both locations. When the OHC effective index is set to zero in the cochlear model’s equations, the nonlinear terms vanish and the model becomes linear.
Figure 5 represents the relative BM velocity obtained by the model when the Hebrew word “SHEN” was introduced. The input word is presented in Figure 4 as a function of time (upper panel) and by its spectrogram (lower panel).
The absolute BM velocity in dB is presented in a color-coded two-dimensional image, whose
The BM velocity in response to the consonant “sh” is very different in the four images in Figure 5. The maximum response was shifted toward the stapes when the amplitude was increased in the normal cochlea. In response to the high level stimuli, the maximum BM velocity obtained was closer to the stapes in the damaged cochlea than in the normal one.
3. Model of the inner hair cell–auditory nerve synapse
The basilar membrane motion is transformed into neural spikes of the auditory nerve by the inner hair cells. The deflection of the hair-cell stereocilia opens mechanically gated ion channels that allow any small, positively charged ions (primarily potassium and calcium) to enter the cell. Unlike many other electrically active cells, the hair cell itself does not fire an action potential. Instead, the influx of positive ions from the endolymph in the scala media depolarizes the cell, resulting in a receptor potential. This receptor potential opens voltage-gated calcium channels; calcium ions then enter the cell and trigger the release of neurotransmitters at the basal end of the cell. The neurotransmitters diffuse across the narrow space between the hair cell and a nerve terminal, where they then bind to receptors and thus trigger action potentials in the nerve. In this way, the mechanical sound signal is converted into an electrical nerve signal. The IHCs chronically leak Ca²⁺. This leakage causes a tonic release of neurotransmitter to the synapses. It is thought that this tonic release is what allows the hair cells to respond so quickly to mechanical stimuli. The quickness of the hair cell response may also be due to the fact that it can increase the amount of neurotransmitter release in response to a change as small as 100 μV in membrane potential.
Many models were developed for explaining the IHC’s transduction abilities [16, 32, 33]. Some models focused on possible mechanisms for adaptation [17, 34–36]. Others were concerned with the biophysics of hair cells [37, 38] or the mechanoelectric transduction process.
One commonly simplified modeling approach to explain the IHC’s role in the auditory system posits a nonlinear system that combines AC and DC responses followed by a random generator that creates spike trains [7, 16, 17, 40]. The model presented in this chapter is consistent with these principles.
The BM displacement stimulates the IHC cilia to move, its velocity corresponding to the BM velocity () by a nonlinear function, e.g.,
Since the BM displacement in this model is nonlinear as described by the mechanical model above, we ignore the nonlinear terms in Eq. (11) and assume that ; therefore, .
The mechanoelectrical receptors that are located in the IHC membrane yield an increase in the electrical potential () of the IHC membrane. A common modeling approach for the IHC’s role in the auditory system is based on a nonlinear system that combines AC and DC responses [7, 40]. The DC level represents the firing responses without any synchrony to the input stimuli and the AC level represents the synchronized firing response (typical at low frequencies). The DC component includes a high-pass filter followed by a moving average filter of 2 ms long; the AC component consists of a low-pass filter. In order to account for physiological observations that demonstrated a reduction in synchronization as the frequency of the stimulus increases, we chose a low-pass filter with a cutoff frequency of 1000 Hz, with a slope of 30 dB/decade. In practice, is obtained by
where represents the location of the IHC along the cochlear partition, is the impulse response of the low-pass filter that represents the IHC response, and , , and are constants (see Table 1). The parameter represents the IHC efficiency index. It was defined as a function of
This IHC receptor potential opens voltage-gated calcium channels; calcium ions then enter the cell and trigger the release of neurotransmitters at the basal end of the cell. The neurotransmitters diffuse across the narrow space between the hair cell and a nerve terminal where they then bind to receptors and thus trigger action potentials in the nerve.
The neural activity in the auditory system is irregular since a specific neuron might respond with a single spike or several spikes to a given stimulus. The origin of the stochastic activity of neurons is poorly understood. This activity results from both intrinsic noise sources that generate stochastic behavior on the level of the neuronal dynamics and extrinsic sources that arise from network effects and synaptic transmission. Another source of noise that is specific to neurons arises from the finite number of ion channels in a neuronal membrane patch [31, 44].
Several ways have emerged to describe the stochastic properties of neural activity. One possible approach treats the train of spikes as a stochastic point process. For example, in their earlier studies, Alaoglu and Smith and Rodieck et al. suggested that the spontaneous activity of the cochlear nucleus can be described as a homogeneous Poisson process. Further investigations of the auditory system described the neural response as a nonhomogeneous Poisson point process (NHPP) whose instantaneous rate depends on the input stimuli [47, 48].
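To make the NHPP description concrete, the following sketch draws spike times from a time-varying rate by thinning (the Lewis–Shedler method). This is a standard simulation technique and is not claimed to be the procedure used in this chapter. The rate function, its ceiling, and all names are hypothetical.

```python
import math
import random

def simulate_nhpp(rate, rate_max, duration, rng):
    """Return sorted spike times in [0, duration) for the rate function rate(t).

    rate_max must be an upper bound on rate(t) over the whole interval.
    """
    spikes = []
    t = 0.0
    while True:
        # Candidate events come from a homogeneous process of rate rate_max.
        t += rng.expovariate(rate_max)
        if t >= duration:
            return spikes
        # Keep each candidate with probability rate(t) / rate_max.
        if rng.random() * rate_max <= rate(t):
            spikes.append(t)

def example_rate(t):
    """Hypothetical instantaneous rate [spikes/s], modulated at 5 Hz."""
    return 100.0 + 50.0 * math.sin(2.0 * math.pi * 5.0 * t)

rng = random.Random(0)
train = simulate_nhpp(example_rate, rate_max=150.0, duration=1.0, rng=rng)
print(len(train), "spikes in 1 s")
```

Over many repetitions the spike count in the interval averages to the integral of the rate, here 100 spikes, while individual trains differ from trial to trial. This is the irregularity that the point-process description is meant to capture.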
In the present chapter, we treat the neural activity as an NHPP, and thus only the instantaneous rate (IR) needs to be extracted. In order to derive the IR, we use the Weber–Fechner law, which describes the relationship between the magnitude of a physical stimulus and its perceived intensity. This kind of relationship can be described by a differential equation:

$$dp = k\,\frac{dS}{S}$$

where $dp$ is the differential change in perception, $dS$ is the differential increase in the stimulus, $S$ is the stimulus at the instant, and $k$ is a constant. Integrating the above equation reveals $p = k\ln S + C$, where $C$ is the constant of integration. Let us define as the IR obtained by the auditory fiber attached to location
where is the step function and is a constant (see Table 1).
In general, the auditory nerve response is divided into three types of fibers according to their spontaneous rates: a high spontaneous rate (HSR) that usually codes low-level stimuli, a medium spontaneous rate (MSR), and a low spontaneous rate (LSR) that generally codes high-level stimuli. In order to include all types of auditory nerves, we substitute in Eq. (13) the relevant constants for the HSR, MSR, and LSR that yield the instantaneous rates , respectively. The different types of ANs are distributed uniformly along the cochlear partition, where the most frequent fibers are those with a low spontaneous rate (about 60%).
The IRs (spikes per second) for the LSR fibers, , as a response to the Hebrew word “SHEN” are exhibited in Figure 6 by color-coded images as a function of time (
The upper-left image in Figure 6 represents a normal cochlea (; ). The upper-right image corresponds to a cochlea with intact OHC but with 25% IHC loss (). A clear reduction in the instantaneous rate is shown. The maximum instantaneous rate was reduced from 160 spikes/s in the normal cochlea to 100 in the damaged one. Moreover, in the damaged cochlea, about 25% more instances (time and location along the cochlear partition) reached the spontaneous rate 0.1 spikes/s relative to the normal cochlea.
The two lower images in Figure 6 represent cochleae with 98% OHC loss (). The BM response was changed as Figure 5 shows. Thus, the reduction in the instantaneous rate corresponds entirely to the decrease in BM velocity when the cochlea has intact IHCs (lower-left image). For a cochlea with both OHC and IHC loss (lower-right image), the instantaneous rate was reduced because of both losses. The response to the high frequencies that correspond to the syllable “SH” almost vanished.
4. Threshold estimation based on the auditory nerve
The hearing threshold, defined as the lowest threshold of acoustic pressure sensation, is usually determined by quantitative psychoacoustical experiments in which the human ability to detect the smallest difference in the stimulus’ physical property is obtained. This difference is referred to as a just-noticeable difference (JND). In such experiments, a subject must distinguish between two close time (
Comparing the behavioral JND and the neural activity is possible if one assumes that the neural system estimates the measured parameters. Siebert obtained such a comparison when the JND of a single tone’s frequency and level was compared to the neural activity of the auditory nerve. Siebert’s findings were based on the assumption that the auditory nerve (AN) response behaves as an NHPP, and the brain acts as an unbiased optimal estimator of the physical parameters. Thus, the JND is equal to the standard deviation of the estimated parameter and can be derived by lower bounds such as the Cramer–Rao lower bound. Heinz et al. generalized Siebert’s results to a larger range of frequencies and levels.
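In a psychoacoustical JND experiment, the JND value is obtained when $d' = 1$, which is expressed by:

$$d' = \frac{E[\hat{\alpha} \mid \alpha^* + \mathrm{JND}(\alpha^*)] - E[\hat{\alpha} \mid \alpha^*]}{\mathrm{std}(\hat{\alpha} \mid \alpha^*)} = 1 \qquad (14)$$

where $\mathrm{std}(\hat{\alpha} \mid \alpha^*)$ is the standard deviation of the estimator, $\alpha^*$ is the true value of $\alpha$, and $\hat{\alpha}$ is the estimated value of $\alpha$. Therefore, an unbiased estimator, $E[\hat{\alpha} \mid \alpha^*] = \alpha^*$, yields the relation $E[\hat{\alpha} \mid \alpha^* + \mathrm{JND}(\alpha^*)] - E[\hat{\alpha} \mid \alpha^*] = \mathrm{JND}(\alpha^*)$, which implies

$$\mathrm{JND}(\alpha^*) = \mathrm{std}(\hat{\alpha} \mid \alpha^*) \qquad (15)$$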
When the estimation is based on neural activity that behaves as an NHPP, there are two possible ways to analyze the performance. The first way is referred to as “rate coding” (RA), which means that the performance is analyzed on the basis of the number of spikes. The second way is referred to as “all information coding” (AI), indicating that in addition to the number of spikes in the interval, the timing of the discharge spikes is considered as well.
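Let us define $N(0,T)$ as the random variable that represents the number of spikes in the time interval $[0,T]$. For the RA coding, the probability density function (pdf) of getting $n$ spikes in this interval is

$$p_{RA}\big(N(0,T)=n\big) = \frac{1}{n!}\left[\int_0^T \lambda(t,\alpha)\,dt\right]^{n} \exp\left[-\int_0^T \lambda(t,\alpha)\,dt\right] \qquad (16)$$

where $\lambda(t,\alpha)$ is the instantaneous rate of the nerve fiber that depends on both the time $t$ and the physical parameter $\alpha$. Given the RA pdf (Eq. (16)), the resulting Cramer–Rao lower bound (CRLB) is obtained by

$$\mathrm{CRLB}_{RA} = \left[\frac{T}{\bar{\lambda}(\alpha)}\left(\frac{\partial \bar{\lambda}(\alpha)}{\partial \alpha}\right)^{2}\right]^{-1/2} \qquad (17)$$

where $\bar{\lambda}(\alpha) = \frac{1}{T}\int_0^T \lambda(t,\alpha)\,dt$ is the average rate.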
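For the AI coding, the probability density function of getting $N$ successive neural spikes at a set of ordered time instances $t_1 < t_2 < \dots < t_N$ is $p_{AI}(t_1,\dots,t_N)$, where, with $\lambda(t,\alpha)$ denoting the instantaneous rate, $p_{AI}$ is obtained by

$$p_{AI}(t_1,\dots,t_N) = \left[\prod_{n=1}^{N} \lambda(t_n,\alpha)\right] \exp\left[-\int_0^T \lambda(t,\alpha)\,dt\right] \qquad (18)$$

The resulting CRLB was derived by Bar David, which yields

$$\mathrm{CRLB}_{AI} = \left[\int_0^T \frac{1}{\lambda(t,\alpha)}\left(\frac{\partial \lambda(t,\alpha)}{\partial \alpha}\right)^{2} dt\right]^{-1/2} \qquad (19)$$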
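The two bounds can be compared numerically with a small sketch. It relies on the standard Fisher-information expressions for a Poisson process with rate $\lambda(t,\alpha)$: $\left(\int \partial\lambda/\partial\alpha\,dt\right)^2 / \int \lambda\,dt$ when only the spike count is used, and $\int (\partial\lambda/\partial\alpha)^2/\lambda\,dt$ when spike timing is used as well. The rate model, the parameter value, and all function names are hypothetical and are not the chapter's auditory nerve model.

```python
import math

def crlb_rate_and_timing(rate, alpha, duration, delta_alpha=1e-4, n_steps=10000):
    """Return (CRLB for rate coding, CRLB for all-information coding)."""
    dt = duration / n_steps
    total_rate = 0.0          # integral of lambda over the interval
    total_derivative = 0.0    # integral of d(lambda)/d(alpha)
    timing_information = 0.0  # integral of (d lambda / d alpha)^2 / lambda
    for i in range(n_steps):
        t = (i + 0.5) * dt    # midpoint rule
        lam = rate(t, alpha)
        d_lam = (rate(t, alpha + delta_alpha) - lam) / delta_alpha
        total_rate += lam * dt
        total_derivative += d_lam * dt
        timing_information += d_lam ** 2 / lam * dt
    rate_information = total_derivative ** 2 / total_rate
    return 1.0 / math.sqrt(rate_information), 1.0 / math.sqrt(timing_information)

def example_rate(t, amplitude):
    """Hypothetical rate: 50 spikes/s baseline plus a 100 Hz locked component."""
    return 50.0 + amplitude * (1.0 + math.sin(2.0 * math.pi * 100.0 * t))

crlb_ra, crlb_ai = crlb_rate_and_timing(example_rate, alpha=20.0, duration=0.1)
print("rate coding bound:", crlb_ra)
print("all-information bound:", crlb_ai)
```

By the Cauchy–Schwarz inequality the all-information bound can never exceed the rate bound, and the two coincide when the rate does not vary in time. The sketch shows the gap opening once the response is phase locked to the stimulus.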
In every unbiased system, the following relations hold:
In an optimal unbiased system, the standard deviation of the estimator can achieve the lower bounds. Since (Eq. 15), can be estimated by calculating or . Comparing the estimated thresholds to experimental results can resolve the question whether the brain estimates the auditory thresholds according to RA or AI coding.
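In order to apply the above-mentioned method for determining the auditory threshold, we should consider the responses of all 30,000 AN fibers that innervate each ear. Since the AN fibers are statistically independent, the information they carry adds, so the JND obtained from $M$ fibers satisfies

$$\frac{1}{\mathrm{JND}^2} = \sum_{m=1}^{M} \frac{1}{\sigma_m^2}$$

where $\sigma_m$ is the standard deviation of the estimator obtained by the $m$-th fiber. For an optimal estimator, each $\sigma_m$ attains its lower bound, which gives

$$\mathrm{JND} = \left[\sum_{m=1}^{M} \frac{1}{\mathrm{CRLB}_m^2}\right]^{-1/2}$$

where $\mathrm{CRLB}_m$ is the CRLB of the $m$-th fiber.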
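Combining independent fibers amounts to adding their Fisher information, that is, adding the inverse squares of the per-fiber bounds. A minimal sketch, with a hypothetical function name and made-up bound values:

```python
import math

def combined_jnd(fiber_bounds):
    """Combine per-fiber lower bounds of statistically independent fibers.

    Each entry is the bound on the estimator's standard deviation for one
    fiber; independent fibers contribute additive information 1 / bound**2.
    """
    total_information = sum(1.0 / bound ** 2 for bound in fiber_bounds)
    return 1.0 / math.sqrt(total_information)

# Four identical fibers halve the bound of a single fiber.
print(combined_jnd([2.0, 2.0, 2.0, 2.0]))
```

The combined bound is always smaller than the best single-fiber bound, and for $M$ identical fibers it shrinks as $1/\sqrt{M}$.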
Let us define the number of fibers attached to each location along the cochlear partition as . Thus, , where is the cochlear length. For every location, three IRs were derived (Eq. 13), which correspond to the HSR, MSR, and LSR fibers, respectively. They are distributed uniformly along the cochlear partition with corresponding weights (see Table 1). Therefore,
4.1. Simulation results: rate or all information?
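In order to calculate both $\mathrm{CRLB}_{RA}$ and $\mathrm{CRLB}_{AI}$, the derivative of the instantaneous rate $\lambda(t,\alpha)$ with respect to the estimated parameter $\alpha$ should be derived. We have used the following finite-difference approximation:

$$\frac{\partial \lambda(t,\alpha)}{\partial \alpha} \approx \frac{\lambda(t,\alpha+\Delta\alpha) - \lambda(t,\alpha)}{\Delta\alpha}$$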
Therefore, in deriving for any stimulus , the IRs for both stimuli and should be calculated. Two types of thresholds will be presented for tones in quiet and in the presence of noise. The quiet threshold was derived by substituting that yielded . For the thresholds in the presence of noise, is equal to the noise, and is equal to the noise +tone with a level of .
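In practice this means running the model twice, once for each of the two neighboring stimuli, and differencing the two sampled rate traces. The sketch below shows that step on a toy rate function. The function names, the toy model, and the step size are hypothetical stand-ins for the full cochlear and synaptic model.

```python
def rate_derivative(rate_at_alpha, rate_at_alpha_plus, delta_alpha):
    """Forward-difference estimate of d(rate)/d(alpha), sample by sample."""
    if len(rate_at_alpha) != len(rate_at_alpha_plus):
        raise ValueError("both rate traces must have the same length")
    return [(hi - lo) / delta_alpha
            for lo, hi in zip(rate_at_alpha, rate_at_alpha_plus)]

def toy_rate(t, level):
    """Hypothetical instantaneous rate that grows with the stimulus level."""
    return 10.0 + level ** 2 * t

times = [i * 0.01 for i in range(100)]
level, delta = 2.0, 1e-3
trace_a = [toy_rate(t, level) for t in times]          # response to alpha
trace_b = [toy_rate(t, level + delta) for t in times]  # response to alpha + delta
derivative = rate_derivative(trace_a, trace_b, delta)
print(derivative[50])   # close to the analytic value 2 * level * t at t = 0.5
```

The step size trades truncation error against numerical noise. For a simulated model whose output is itself computed numerically, a step that is too small can be dominated by the solver's own error.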
We have calculated the amplitude thresholds as a function of frequency while using both types of coding, RA and AI. The derived thresholds are shown in Figure 7 along with the normal equal-loudness-level contour at threshold (ISO 226:2003). The rate coding successfully predicts the ISO 226 standard, while the AI coding yielded thresholds that are better by a few decibels. This difference was not sufficient for deciding what type of coding is used by the brain in order to determine the absolute thresholds. Deriving the thresholds in the presence of noise revealed a more significant difference between the two types of coding.
In order to present the threshold of tones in the presence of noise, the smallest perceivable difference is presented in terms of difference limen (DL), which are defined as
where corresponds to the noise level in Volts and is the derived
4.2. Simulation results: Abnormal ears
Audiograms of the hearing impaired were estimated by subtracting the threshold of the damaged ear from the threshold defined by the equal loudness at threshold . The estimated audiograms of different types of pathologies are shown in Figure 9. In all the estimated audiograms, we assumed that both IHC and OHC loss were uniform along the cochlear partition, which implies that and
Three audiograms are exhibited in panel A of Figure 9. They were obtained with (the normal value) and three values of that represent of OHC loss, respectively. Due to OHC loss of 50%, no hearing loss was obtained up to 2 kHz. With 100% OHC loss, the estimated audiogram revealed a maximum hearing loss of about 60 dB at 6 kHz. Panel B of Figure 9 represents cochleae with no OHC loss () but with different degrees of IHC loss, , which represents of IHC efficiency. Reduction in IHC efficiency caused a maximum hearing loss at 1000–2000 Hz. A combination of IHC and OHC loss is probably a more common pathology; an example of its effect is shown in Figure 9C. It represents cochleae with 75% OHC loss () and different degrees of IHC loss. The maximum hearing loss was obtained at 4 kHz. The estimated audiogram with resembles a typical mild audiogram while the one with resembles a typical severe audiogram.
The effect of background noise on the threshold to tones is demonstrated in Figure 10, where DL is plotted as a function of noise level for different frequencies. As a result of OHC loss, a significant increase in DL was obtained relative to the normal cochlea, especially at high frequencies. The combination of IHC and OHC loss caused an increase in DL at all frequencies. It seems that IHC loss causes an increase in DL at low frequencies, below 1000 Hz. This result might explain the difficulties that people with mild hearing loss have in understanding speech in a noisy background, since the information of speech sounds is mainly included in the low frequency range.
In this study, a comprehensive model for the auditory system was introduced. It included a detailed, nonlinear time domain cochlear model with active outer hair cells that are driven by the tectorial membrane motion. Outer hair cell loss was indicated by an OHC efficiency index that could change along the cochlear partition. The second part of the model included a synaptic model that generates the auditory nerve’s instantaneous rate as a response to basilar membrane motion and is affected by inner hair cell transduction efficiency. Since both inner and outer hair cell loss can be easily integrated in the model, the model is useful for demonstrating those pathologies.
In order to compare normal and abnormal human abilities to the model predictions, a comprehensive technique was introduced. It was based on the assumption that the brain behaves as an optimal processor and its task in JND experiments is to estimate physical parameters. The performance of the optimal processor can be derived by calculating its lower bound. Since the neural activity was described as an NHPP, the Cramer–Rao lower bound was analytically derived for both rate and all information coding.
In this study, we have shown that the amplitude of tones in quiet and in the presence of background noise is most likely coded by the rate only. Pathological audiograms can be predicted by introducing reduced OHC and IHC efficiency indices. Moreover, the presence of noise causes a significant increase in DL. The dependence of DL on frequency is determined by the type of hearing loss. In general, OHC loss mostly affects the high frequencies, while the effect of IHC loss is mostly expressed in the low frequencies.
The model presented in this paper can be used as a framework to explore different types of pathologies on the basis of audiograms obtained in quiet and in the presence of background noise.
This research was partially supported by the Israel Science Foundation grant no. 563/12. I want to express my great appreciation to my students who participated in this research over the years: Ram Krips, Azi Cohen, Vered Weiss, Noam Elbaum, Oren Cohen, Dan Mekrantz, Oded Barzely, Yaniv Halmut, and Tal Kalp.