Head-related transfer functions (HRTFs) describe the spatial filtering of acoustic signals by a listener’s anatomy. With the increase in computational power, HRTFs are nowadays more and more used for the spatialised headphone playback of 3D sounds, enabling personalised binaural audio. HRTFs are traditionally measured acoustically, and various measurement systems have been set up worldwide. Despite the trend to develop more user-friendly systems, and as an alternative to the expensive and rather elaborate measurements, HRTFs can also be numerically calculated, provided an accurate representation of the 3D geometry of head and ears exists. While under optimal conditions it is possible to generate said 3D geometries even from 2D photos of a listener, geometry acquisition is still a subject of research. In this chapter, we review the requirements and state-of-the-art methods for obtaining personalised HRTFs, focusing on recent advances in numerical HRTF calculation.
- head-related transfer functions
- spatial hearing
- acoustic measurement
- numerical calculation
Head-related transfer functions (HRTFs) describe the filtering of the acoustic field produced by a sound source arriving at the listener’s ear. The filtering is the effect of the interaction of the sound field with the listener’s anatomy and has various properties. First, the incoming sound wave arrives at the ipsilateral pinna, i.e., the ear closer to the sound source, and then at the contralateral ear, i.e., the ear away from the sound source. This time difference between the ipsilateral and contralateral ear is usually described as the interaural time difference (ITD). Second, larger anatomical structures, i.e., torso, shoulders and head, affect frequencies up to 3 kHz in a comparatively simple way. As the listener’s torso and head shadow the sound wave arriving at the contralateral ear, interaural level differences (ILDs) arise. Third, the incoming sound is filtered in a complex way by the shape of the listener’s pinnae. These monaural time-frequency filtering effects become especially important for higher frequency regions (above approximately 4 kHz) and for sound directions inducing the same ITDs and ILDs [1, 2, 3, 4, 5, 6]. Humans have learned to interpret this acoustic filtering to span an auditory space as an internal model of their natural environment. Because the pinna shape is unique for every person, HRTFs are considered listener-specific [8, 9, 10], similar to a fingerprint [1, 2, 3, 4, 5, 6]. With an individually fitted HRTF dataset, it is possible for a person to perceive sounds (in a virtual environment) via headphones as if the sounds originated from physical positions around the listener.
Both interaural and monaural features for a single sound direction can be represented by a binaural HRTF pair. In signal processing terms, a binaural HRTF pair can be described as

H^{L,R}(\vec{r}, f, G) = \frac{p^{L,R}(\vec{r}, f, G)}{p_0(\vec{r}, f)},

where p^{L} and p^{R} describe the sound pressure at a position inside the left and right ear, respectively (typically the entrance of the left and right ear canal or a position close to the eardrum), \vec{r} describes the sound-source position (i.e., distance and direction), f describes the frequency and G the listener’s geometry, emphasising the listener-specificity of HRTFs. p_0 describes the reference sound pressure, which is usually the pressure measured at the position of the midpoint between the right and left ear with the listener absent.
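In numerical terms, this ratio is evaluated frequency by frequency. The following sketch (with synthetic impulse responses as placeholders, not measured data) illustrates computing an HRTF as the spectral ratio of an ear signal to a reference signal:

```python
import numpy as np

# Synthetic sketch: the HRTF as the ratio of the spectrum at the ear to the
# reference spectrum at the head centre (listener absent). Both impulse
# responses below are placeholders, not measured data.
fs = 48000
n = 512
h_ref = np.zeros(n)
h_ref[10] = 1.0                      # reference path: a pure delay
h_ear = np.zeros(n)
h_ear[14] = 0.8                      # direct sound at the ear
h_ear[40] = 0.2                      # plus a pinna-like reflection

P_ear = np.fft.rfft(h_ear)
P_ref = np.fft.rfft(h_ref)
H = P_ear / P_ref                    # HRTF, one complex value per frequency bin
freqs = np.fft.rfftfreq(n, 1 / fs)   # frequency axis in Hz
magnitude_db = 20 * np.log10(np.abs(H))
```

Because the reference path is a pure delay with unit magnitude, the division mainly removes the common propagation delay and leaves the (here synthetic) direction-dependent filtering.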
There are several options to set a specific coordinate system to systematically describe directions for HRTFs. From the physical perspective, the
The understanding of these coordinate systems is important because state-of-the-art acquisitions and representations of HRTFs utilise those systems. For example, Figure 2 shows HRTFs along the Frankfurt and the median plane. These various coordinate systems are used in HRTF visualisation, in various HRTF-related software packages such as the SOFA toolbox , and in auditory modelling, e.g., the Auditory Modelling Toolbox (AMT) [16, 17].
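Since these coordinate systems recur throughout HRTF processing, a small conversion helper is often useful. The sketch below converts a source direction from spherical to Cartesian coordinates, following the convention of azimuth counter-clockwise from the front and elevation up from the horizontal plane (as used, e.g., in SOFA files); other conventions exist, so treat the axis assignment as an assumption:

```python
import math

def sph_to_cart(azimuth_deg, elevation_deg, radius=1.0):
    """Convert a source direction given as azimuth (counter-clockwise from
    the front) and elevation (up from the horizontal plane), both in
    degrees, to Cartesian x/y/z. Axis convention as in SOFA; other
    packages may differ."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = radius * math.cos(el) * math.cos(az)
    y = radius * math.cos(el) * math.sin(az)
    z = radius * math.sin(el)
    return x, y, z

front = sph_to_cart(0.0, 0.0)    # straight ahead -> (1, 0, 0)
left  = sph_to_cart(90.0, 0.0)   # on the left interaural axis -> (0, 1, 0)
above = sph_to_cart(0.0, 90.0)   # zenith -> (0, 0, 1)
```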
HRTF acquisition can be classified into three categories: acoustic measurement, numerical calculation, and personalisation .
The acoustic measurement is traditionally designed as the measurement of the impulse response between source and receiver in an anechoic or semi-anechoic chamber, describing the transmission path from a sound source to the ear [11, 19]. A comprehensive review of the established state-of-the-art acoustic techniques to measure HRTFs can be found in . Thus, in Section 3 of this chapter, we only briefly provide an overview of the traditional acoustic HRTF measurement approaches, highlight some of their differences and new trends, and focus on the requirements for the acoustic measurement.
Numerical HRTF calculation simulates the acoustic measurement by considering a 3D representation of the listener’s geometry and the positions of multiple external sound sources, for which the generated sound pressure at the entrance of the ear canal is calculated. This technique has become more popular and is the main focus of this chapter. To this end, in Section 4, we provide an overview of the principles of various numerical calculation approaches including a comparison of the mentioned methods.
Personalisation of HRTFs describes the process of adapting an existing set of generic data guided by listener-specific information, with the help of either objective or subjective personalisation methods. The objective personalisation has been approached from two different domains: the geometric domain, in which listener-specific anthropometric data are measured and used to personalise a generic geometric model from which HRTFs are then simulated; or the spectral domain, in which a generic HRTF set is directly personalised based on listener-specific information. Examples of personalisation approaches include frequency scaling , parametric modelling of peaks and notches , active shape modelling (ASM) , principal component analysis (PCA) in both geometric  and spectral domains [25, 26, 27, 28, 29], multiple regression analysis , independent component analysis (ICA) , large deformation diffeomorphic metric mapping (LDDMM) [25, 32], local neighbourhood mapping , neural networks [34, 35, 36, 37, 38, 39, 40, 41] and linear combinations of HRTFs . Despite many efforts worldwide [43, 44, 45, 46], the link between morphology and HRTFs is not fully understood yet, mostly because of the high dimensionality of the problem. The most recent tools for studying that link are rooted in aligning high-resolution pinna representations to target representations, facilitated by parametric pinna models [47, 48].
In the subjective personalisation, listeners are confronted with several sets of HRTFs, and an algorithm (usually based on the evaluation of localisation errors, i.e., the difference between perceived and actual sound-source location) adapts the HRTF sets, aiming to converge on listener-specific HRTFs [9, 49]. For an educated guess of the initial sets, anthropometric data can be used to pre-scale the HRTF sets, or the HRTF sets can be pre-selected via psychoacoustic models . Clustering of the HRTF sets can further improve the relevance and reduce the duration of the personalisation procedure [49, 51].
All these methods aim at providing a specific quality in terms of acoustic and psychoacoustic properties. In the following section, we describe the acoustic properties and psychoacoustic requirements for human HRTFs, both of which lay the basis for HRTF acquisition. Then, we briefly describe the most important requirements for the acoustic HRTF measurement, complementing the work of Li and Peissig . Finally, we describe approaches for numerical HRTF calculation in greater detail.
2. Head-related transfer functions: acoustic properties and psychoacoustic requirements
In this section, we describe the acoustic properties of HRTFs and relate them to psychophysical properties of human hearing with the goal to derive the minimum requirements for sufficiently accurate HRTF acquisition by means of perception. We analyse spectral, temporal and spatial aspects of HRTFs and consider contributions of distinct parts of the human body to these aspects.
Humans can hear frequencies roughly between 20 Hz and 20 kHz, with frequencies at the lower end being perceived as vibrations or creaks, and with the upper end decreasing with age and duration of noise exposure . From the psychoacoustic perspective, frequencies down to 90 Hz contribute to sound lateralisation, i.e., localisation on the interaural axis within the head , and frequencies up to 16 kHz contribute to sound localisation, i.e., localisation outside the head , defining the minimum frequency range required for HRTF acquisition. Figure 2 shows the amplitude spectra of a binaural HRTF pair of two listeners. For each listener, the left and right columns show HRTFs of the left and right ear, respectively. The top row shows the HRTFs along the median plane, i.e., for a lateral angle of zero, from the front, via up, to the back. The bottom row shows the HRTFs along the Frankfurt plane, i.e., the horizontal plane located at eye level. Figure 2 demonstrates that HRTFs vary across ears, frequencies, sound-source positions and listeners. The bottom panels emphasise the difference between the ipsilateral and contralateral ear, showing the dynamic range, especially for frequencies higher than 6 kHz.
Assuming the propagation medium is air and a speed of sound of 340 m/s, the human hearing frequency range translates to wavelengths approximately between 1.7 cm and 17 m, resulting in different body parts affecting HRTFs in different frequency regions. The reflections off the torso create spatial-frequency modulations in the range of up to 3 kHz . This effect can be observed in the top row of Figure 2, in the form of elevation-dependent spectral modulations along the median plane [55, 56]. Another contribution comes from the head, which shadows frequencies above 1 kHz. This effect can be observed in both rows of Figure 2, with large changes in the spectra beginning at around 1 kHz . A large contribution is that of the pinna: the resonances and reflections within the pinna geometry create spectral peaks and notches, respectively, at frequencies above 4 kHz . This effect can be observed in the bottom row of Figure 2.
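The mapping between frequency and wavelength quoted above follows directly from λ = c/f; a minimal check of the numbers:

```python
c = 340.0  # speed of sound in m/s, as assumed in the text

def wavelength_m(frequency_hz):
    """Wavelength in metres for a given frequency in air."""
    return c / frequency_hz

# Band edges of human hearing:
lambda_low  = wavelength_m(20.0)      # 17 m at 20 Hz
lambda_high = wavelength_m(20000.0)   # 1.7 cm at 20 kHz

# Illustrative scales for the body parts discussed above (rough guide,
# not a precise acoustic criterion):
lambda_torso = wavelength_m(3000.0)   # ~11 cm: torso effects below ~3 kHz
lambda_pinna = wavelength_m(4000.0)   # ~8.5 cm: pinna effects above ~4 kHz
```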
From the perceptual perspective, the quality of these HRTF spectral profiles is important in many processes involved in spatial hearing. For example, sound-localisation performance deteriorates when these spectral profiles are disturbed by means of introducing spectral ripples , reducing the number of frequency channels  or spectral smoothing . From the acoustic perspective, these spectral profiles show modulation depths of up to 50 dB , defining the required dynamic range in the process of HRTF acquisition.
The temporal aspects of HRTF acquisition are shown in Figure 3 as the head-related impulse responses (HRIRs), i.e., HRTFs in the time domain, of the same listeners as in Figure 2. There are a few things to consider. First, the minimum length of the measurement is bounded by the length of the HRIRs. Their amplitude decays within the first 5 ms, setting the requirement for the room impulse response during the measurements . After these 5 ms, the HRIRs have decayed by more than 50 dB, setting the requirement on the broadband signal-to-noise ratio (SNR) of the measurements. Further, because of the human sensitivity to interaural disparities, HRTF acquisition also requires an interaural temporal synchronisation. While sound sources placed in the median plane cause an ITD of zero (theoretically reached only for identical path lengths to the two ears), even small deviations from the median plane cause potentially perceivable non-zero ITDs. Human listeners can detect ITDs as small as 10 μs [53, 62], defining the interaural temporal precision required in the HRTF acquisition process. The ITD increases with the lateral angle of the sound source, reaching its extreme values for sources placed near the interaural axis [63, 64]. The largest ITD depends on the distance between the listener’s two ears, mostly defined by the listener’s head width and depth , reaching values of up to ±800 μs. That ITD range translates to the sound’s time of arrival (TOA) at an ear varying within a range of 1.6 ms, which needs to be considered in the HRTF measurement by providing sufficient temporal space in the resulting impulse response.
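The growth of the ITD with the lateral angle can be approximated with a spherical-head model; the Woodworth formula below is a common textbook estimate, not the method of the chapter, and the head radius of 8.75 cm is an assumed average:

```python
import math

def itd_woodworth(azimuth_deg, head_radius_m=0.0875, c=340.0):
    """Spherical-head ITD estimate (Woodworth formula) for far-field
    sources in the horizontal plane: ITD = a/c * (theta + sin(theta)).
    The 8.75 cm head radius is a commonly assumed average."""
    theta = math.radians(azimuth_deg)
    return head_radius_m / c * (theta + math.sin(theta))

itd_front = itd_woodworth(0.0)    # 0 s for a source in the median plane
itd_side  = itd_woodworth(90.0)   # maximum, near the interaural axis
```

For the assumed head radius, the lateral maximum comes out around 660 μs, consistent in magnitude with the up-to-±800 μs range quoted above (which also includes head depth).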
HRTFs are continuous functions in space, even though, they are traditionally acquired for a finite set of spatial positions. From the
HRTFs are listener-specific, i.e., they vary among listeners . The reasons for that inter-individual variation are usually rooted in the listener-specific morphology of the head and ears. For example, the variation in head width of approximately ±2 cm across the population causes variation in the largest ITD in the range of ±80 μs . Figure 4 shows HRTF-relevant parts of the human body, where Figure 4a shows rough measures of the body and Figure 4b shows areas of the pinna responsible for the distinct spectral features at higher frequencies. The width and depth of head and torso have a large effect on HRTFs in the lower frequencies. The inter-individual variation in the pinnae geometry causes variations in HRTFs at frequencies above 4 kHz, with listener-specific differences of up to 20 dB . The inter-individual variation in the HRTFs is rather complex because the pinna is a complex biological structure—small variations in geometry (in the range of millimetres) may cause drastic changes in HRTFs along the vertical planes at high frequencies , see Figure 2. However, not all pinna regions affect HRTFs equally . Basically, the convex curvatures of the pinnae contribute to focusing the incoming sound waves towards the entrances of the ear canals, comparable to a satellite dish. Figure 4b shows the anatomical areas important for the localisation of sounds [48, 56, 89, 92, 93]. Currently, describing the pinna geometry is not a trivial task. Pinnae have been described by means of anthropometric data stored in various data collections, e.g., [67, 69, 88, 94, 95, 96]. While the parameters used in these data collections do not seem to completely describe a pinna geometry from scratch, recent efforts aim at parametric pinna models able to generate non-pathological pinna geometries for arbitrary listeners [47, 48]. Such models describe the pinna geometry by means of various control points placed on the surface of a template pinna geometry.
Figure 5 shows two examples of the implementation of such models. In Figure 5a, the pinna geometry is parametrised with the help of Beziér curves, i.e., polynomials within a spatial boundary . Figure 5b shows a different approach; here, the parameterisation of the pinna is utilised with control points that move proximal local areas . These parametric pinna models represent a step towards understanding the link between HRTFs and specific anatomical regions of the pinnae, and provide potential to synthesise large datasets of pinnae, e.g., in order to provide data for machine-learning algorithms.
In addition to the geometry, skin and hair may have an impact on HRTFs [97, 98] because of their direction-dependent absorption of acoustic energy, especially at high frequencies. However, recent studies have shown that hair does not influence localisation performance, but rather the perception of timbre [95, 99, 100, 101].
3. Acoustic measurement
The principle of an acoustic HRTF measurement relies on system identification, with the HRTF considered as a linear and time-invariant system. Here, an HRTF describes the propagation path between a microphone and a loudspeaker. Because of the required binaural synchronisation, HRTFs are measured simultaneously at the two ears. The measurements are commonly performed for many source positions because of the required high spatial resolution. Recently, the details of the acoustic measurements, including a comprehensive list of HRTF measurement sites, have been reviewed . Thus, we only briefly introduce the basics and focus on the most recent advances in the acoustic HRTF measurement.
Typically, two omnidirectional microphones are placed in the two ear canals, and the loudspeakers are arranged around the listener, ideally with the number of loudspeakers corresponding to the number of HRTF positions to be measured. Figure 6 shows two examples of measurement setups of various complexity: In Figure 6a, the listener is located on a turntable and moves within a fixed, near-complete circular loudspeaker array. Figure 6b shows a similar approach with a near-complete spherical loudspeaker array, and Figure 6c shows the placement of a microphone in the ear canal so that its membrane lines up with the entrance of the ear canal. Actually, it does not matter whether the microphones or the loudspeakers are placed in the ear canal—this principle of ‘reciprocity’ is usually exploited in numerical HRTF calculations (Section 4.4). However, setups with loudspeakers in the ears  lack signal-to-noise ratio (SNR) as the amplitude of the source signal needs to be low enough to not harm the listener, making the setup impractical for experiments. With the microphones in the ears, the simplest setups consist of a single loudspeaker moved around the listener . Unfortunately, such setups lead to a long measurement duration for a dense set of HRTF positions. With the increasing availability of multichannel sound interfaces and adequate electroacoustic equipment, the number of loudspeakers actually used has increased over the decades. Setups with only a single loudspeaker moving around the listener have been replaced by setups with loudspeaker arcs surrounding the listener. In those setups, the listener sits on a turntable and either the listener (e.g., Figure 6a) or the loudspeaker arc is rotated [88, 104].
Recent approaches follow one of two directions. On the one hand, generic and individual HRTFs are measured with a growing number of loudspeakers in specialised facilities [67, 95], some even with so many loudspeakers that the listener is rotated to only a few discrete positions and post-processing algorithms interpolate between HRTF directions, e.g., the setup in Figure 6b. On the other hand, user-friendly individual HRTF measurement approaches have been suggested, showing a trend towards decreasing the complexity of the measurement setup and using widely available equipment. In these approaches, only a single loudspeaker is used, and the listener is asked to move the head until a dense set of HRTF directions has been obtained. These measurements enable simple systems to be used at home [105, 106], in which a head-tracking system records the listener’s head movements in real time and adapts the measured spatial HRTF grid. Head-above-torso orientations have to be considered additionally , but such approaches reduce the complexity of the measurement setup and enable the use of widely available equipment, e.g., a commercially available VR headset and an arbitrary loudspeaker, in regular rooms, thus increasing the user-friendliness of such setups .
Most of those recent approaches consider spatially discrete positions of the listener and/or the loudspeakers. In order to tackle the trade-off between high spatial resolution and long measurement duration, other recent advances have been made towards spatially continuous measurement approaches [107, 108, 109]. These approaches enable the measurement of all directions around the listener for a single elevation within less than 4 minutes . An advantage of such an approach is certainly the access to spatially continuous information, which is especially important for frontal HRTF directions. With increasingly silent turntables and swivel chairs, achieving a high SNR is no longer a big issue. The most recent approaches related to spatially continuous measurement utilise Kalman filters to acquire system parameters representing HRTFs, and thus speed up the HRTF measurement in a multi-channel setup . Compared to spatially discrete approaches, the spatially continuous method can achieve an accuracy within a spectral error of 2 dB .
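The cited continuous approaches rely on Kalman filters; as a simplified stand-in for such adaptive system identification, the sketch below uses a normalised LMS (NLMS) filter to track an unknown impulse response from a continuously running excitation (all signals are synthetic placeholders):

```python
import numpy as np

# Minimal NLMS adaptive filter identifying an unknown impulse response
# from a continuously playing excitation -- a simplified stand-in for the
# Kalman-filter-based continuous measurement, not the cited algorithm.
rng = np.random.default_rng(0)
true_ir = rng.normal(size=32)            # unknown system (placeholder)
x = rng.normal(size=20000)               # excitation signal
d = np.convolve(x, true_ir)[: len(x)]    # "recorded" ear signal

w = np.zeros_like(true_ir)               # running filter estimate
mu, eps = 0.5, 1e-8
for n in range(len(true_ir), len(x)):
    u = x[n - len(w) + 1 : n + 1][::-1]  # most recent samples, newest first
    e = d[n] - w @ u                     # prediction error
    w += mu * e * u / (u @ u + eps)      # normalised LMS update

# Relative misalignment of the estimate, in dB (converges towards -inf
# here because the synthetic recording is noiseless).
error_db = 20 * np.log10(np.linalg.norm(w - true_ir) / np.linalg.norm(true_ir))
```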
The requirements on the room are not rigorous: In principle, the measurement room does not have to be perfectly anechoic, but it has to fulfil some requirements regarding size and reverberation time. Room modes may exist below 500 Hz, as they can be neglected in that frequency range . Acceptable measurement results can be obtained as long as the first room reflection arrives after 5 ms, such that the measured room impulse responses can be truncated without truncating the HRIRs. Medium and large surfaces, e.g., the mounts of the loudspeakers, the loudspeaker arc, the turntable, the listener’s seat, etc., can potentially cause acoustic reflections overlapping with the direct sound path within the first 5 ms of the HRTF. These reflections are usually damped, e.g., by covering the loudspeakers with absorbing material. Before the measurement, the listener’s head has to be aligned in the measurement setup, adjusting the ears to the interaural axis of the system and the head to the Frankfurt plane. This alignment can be supported by, e.g., a laser system. The orientation and position of the listener’s head should be monitored throughout the measurement procedure in order to detect the listener’s unwanted movements or position drifts. This helps to identify potentially corrupted measurements that have to be repeated.
The loudspeakers used for the measurements need to show a fast impulse-response decay; fast enough not to interfere with the temporal characteristics of the HRTFs. This can be achieved by using loudspeaker drivers with light membranes, simple electrical processing and no acoustically resonating enclosure such as a bass-reflex system. The acoustic short circuit usually limits the lower frequency range of the loudspeakers, and multi-driver systems are a common solution to that problem. In order to achieve a spatially compact acoustic source in a multi-driver system, it is common to use coaxial loudspeaker drivers with an omnidirectional directivity pattern in HRTF measurement systems .
The placement of the microphones can also be an issue. Early setups used an open ear canal where the microphones were positioned close to the eardrum . However, the effect of the ear canal does not seem to be direction-dependent, and its consideration in the measurement introduces technical difficulties and a large measurement variance [19, 113, 114]. Nowadays, the microphones are usually placed at the entrance of the ear canal, which is acoustically blocked [11, 20]. Blocking the ear canal can be achieved by using microphones enclosed in earplugs made from foam or silicone, or by wrapping the microphone in skin-friendly tape before inserting it. Note that such a measurement captures all direction-dependent features of the acoustic filtering by the outer ear; however, the direction-independent filtering by the ear canal is not captured. All cables from the microphones have to be flexible enough to minimise their effect on the acoustics within the pinna—one way is to lead the cable through the incisura intertragica and secure it with tape on the cheek, see Figure 6c.
In general, system identification can be performed with a variety of excitation signals. While previously Golay codes or other broadband signals have been used , more recently the multiple exponential sweep method (MESM)  has been established and further improved , enabling fast HRTF measurement at high SNRs and reducing the discomfort for the listener. Still, because of the imperfections in the electro-acoustic setup, a reference measurement is required to estimate the baseline of the measurement without the effect of the listener, i.e., to estimate the reference sound pressure. This is typically done for each microphone by placing the microphone in the centre of the measurement setup and recording the loudspeaker-microphone impulse response for all loudspeakers. The reference measurement can also be used to control the sound pressure level in order to avoid clipping at the microphones and analogue-to-digital converters. Clipping can especially happen when each loudspeaker is driven within its linear range, but the overlapping signals from multiple loudspeakers raise the total level beyond the linear range of the recording system. Additionally, because of the HRTF’s resonances, the level during the actual HRTF measurements can be up to 20 dB higher than that during the reference measurement, translating to a requirement for a headroom of at least 30 dB at the reference measurement. The maximum level is not only limited by the equipment; the listener’s hearing range also needs to be considered, i.e., the maximum sound pressure level must neither create discomfort for the listener nor go beyond the levels of safe listening. There is no special requirement for the listener regarding their audible threshold, hearing range or visual sense. However, particular measurement equipment or a particular lab could impose some restrictions on, e.g., the listener’s weight or height.
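The exponential sweep underlying MESM can be sketched as follows; the inverse-filter construction follows Farina's method, and the measured 'system' here is just a synthetic delay for illustration:

```python
import numpy as np

def exp_sweep(f1, f2, duration, fs):
    """Exponential sine sweep (after Farina) and its inverse filter.
    Convolving a recording of the sweep with the inverse filter yields
    the impulse response of the measured system."""
    t = np.arange(int(duration * fs)) / fs
    L = duration / np.log(f2 / f1)
    sweep = np.sin(2 * np.pi * f1 * L * (np.exp(t / L) - 1.0))
    # Time-reversed sweep with a decaying envelope that compensates the
    # sweep's -6 dB/octave energy distribution.
    inverse = sweep[::-1] * np.exp(-t / L)
    return sweep, inverse

fs = 48000
sweep, inverse = exp_sweep(200.0, 16000.0, 0.1, fs)  # short sweep for brevity

# "Measure" a known system: a pure delay of 100 samples.
recorded = np.concatenate([np.zeros(100), sweep])
ir = np.convolve(recorded, inverse)
peak = int(np.argmax(np.abs(ir)))  # impulse near len(sweep) - 1 + 100
```

In a real measurement, the sweep would be several seconds long, faded in and out, and the recorded ear signals would replace the synthetic delay.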
Figure 7 shows measurement grids of three exemplary setups and one measurement grid of a simulation setup. Figure 7a and b correspond to the measurement setups in Figure 6a and b. In these setups, not every loudspeaker plays a stimulus at every position around the listener. An extreme case is a loudspeaker positioned directly above the listener, which needs to be measured only once. Figure 7c shows another setup with uniformly distributed measurement points, and Figure 7d shows a uniform sampling grid used in numerical calculation experiments .
The repeatability of the measurement is an important issue. Within a single laboratory, changes in the room conditions such as temperature and humidity, as well as changes in the setup such as the ageing of the equipment, may compromise the repeatability of the HRTF measurement [11, 20]. When comparing HRTF measurements across labs, differences in the setups also play a role. In an inter-laboratory and inter-method comparison of HRTF measurements obtained for the same artificial head, severe ITD variations of up to 200 μs have been found [63, 64].
Once the HRTFs have been measured for all source positions, post-processing needs to be done before the HRTFs are ready to be used. First, in order to account for acoustic artefacts caused by the measurement room, a frequency-dependent windowing function is usually applied, truncating the HRIRs [100, 117, 118]. Second, the measured HRIRs are equalised by the impulse response obtained from the reference measurements, i.e., with the microphone placed at the centre of the coordinate system with the listener absent. This equalisation can be either free-field or diffuse-field. For the free-field equalisation, the reference measurement is required only for the frontal direction (0° azimuth, 0° elevation) , whereas for the diffuse-field equalisation, the reference is the root mean square (RMS) of the impulse responses of all directions , and the results are commonly denoted as directional transfer functions (DTFs) . Third, in most common rooms and even in (semi-)anechoic rooms, reflections (or room modes) cause artefacts below 400 Hz, confounding the free-field property of HRTFs. Additionally, most loudspeakers used in the measurement are not able to reproduce low frequencies with sufficient power. Since the listener’s anthropometry has a small effect on HRTFs in the low-frequency range, HRTFs can be extrapolated towards lower frequencies with a constant magnitude and linear phase [20, 117]. Further post-processing steps may include spectral smoothing to account for listener-position inaccuracies [60, 120], or adding a fractional delay to account for temperature changes followed by onset changes of the time signals .
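The diffuse-field equalisation step can be sketched as a per-frequency division by the RMS magnitude across all measured directions; this magnitude-only version is a simplification of the cited processing, and the HRTF data are random placeholders:

```python
import numpy as np

# Sketch of diffuse-field equalisation: divide each measured HRTF by the
# RMS magnitude across all measured directions, leaving the directional
# transfer functions (DTFs). HRTF data here are synthetic placeholders.
rng = np.random.default_rng(1)
n_dirs, n_bins = 200, 257
hrtfs = rng.normal(size=(n_dirs, n_bins)) + 1j * rng.normal(size=(n_dirs, n_bins))

# Diffuse-field reference: RMS magnitude over all directions, per bin.
diffuse_field = np.sqrt(np.mean(np.abs(hrtfs) ** 2, axis=0))
dtfs = hrtfs / diffuse_field              # broadcast over directions

# After equalisation, the RMS across directions is flat (= 1) at every bin.
rms_after = np.sqrt(np.mean(np.abs(dtfs) ** 2, axis=0))
```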
The availability of acoustic HRTF measurements was a big step towards personalised binaural audio and virtual-reality experiences. However, even a fast or continuous measurement method requires the listener to sit still for a few minutes [104, 110, 112] in a specialised lab facility. Recent advances have been made towards both large-scale high-resolution and small-scale at-home easy-to-use solutions, providing HRTF acquisition to a large audience. Still, the imperfections in the electro-acoustic equipment remain a drawback of the acoustic measurement. Here, recent advances in the numerical calculation of HRTFs can provide an interesting alternative.
4. Numerical calculation of HRTFs
Generally, the calculation of HRTFs simulates the effects of the pinna, head and torso on the sound field at the eardrum. The goal is to numerically obtain the sound pressure at the two ears for a given set of frequencies and spatial positions. There are many methods to simulate wave propagation . When applied to the HRTF calculation, all of the methods require a geometric representation of head and pinnae as input. For an accurate set of HRTFs, an exact 3D representation of the geometry, especially that of the pinnae with all their crests and folds, is of utmost importance . The 3D geometry is represented using a discrete and finite set of elements, further denoted as ‘mesh’. A mesh is a representation of the region of interest (ROI), i.e., the object’s volume and surface, with the help of simple geometric elements. In most applications, the faces of these elements are assumed to be flat, which in turn explains the preference for triangular faces because they are always flat and therefore have one unique normal vector. This is not always the case for other shapes, e.g., quadrilaterals.
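The uniqueness of the face normal for triangles can be seen directly from the cross product of two edge vectors; a minimal sketch:

```python
import numpy as np

def triangle_normal(v0, v1, v2):
    """Unit normal of a flat triangular face via the cross product of two
    edge vectors. A triangle always admits exactly one such normal (up to
    sign), which is why triangular meshes are preferred; a non-planar
    quadrilateral has no single well-defined normal."""
    n = np.cross(np.asarray(v1, float) - np.asarray(v0, float),
                 np.asarray(v2, float) - np.asarray(v0, float))
    return n / np.linalg.norm(n)

# A triangle in the x/y plane has the z axis as its normal:
normal = triangle_normal([0, 0, 0], [1, 0, 0], [0, 1, 0])
```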
The requirements on the mesh have to consider geometrical as well as acoustical aspects. From the acoustic perspective, a typical rule of thumb for numerical calculation requires the average edge length (AEL) of the elements to be at most a sixth of the smallest wavelength , which corresponds to an AEL of 3.5 mm for frequencies up to 16 kHz. However, in order to describe the pinna geometry sufficiently accurately, the AEL of the elements in the mesh needs to be around 1 mm, independently of the calculation method . Some numerical calculation algorithms are, in general, more efficient and stable if the geometries are represented locally with elements of similar sizes and as regular as possible, e.g., almost equilateral triangles. To this end, the mesh may undergo a so-called
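The λ/6 rule of thumb translates into a maximum edge length as follows (using the 340 m/s speed of sound assumed earlier):

```python
c = 340.0  # speed of sound in m/s, consistent with the earlier assumption

def max_edge_length_mm(f_max_hz, elements_per_wavelength=6):
    """Rule-of-thumb mesh resolution: at least six elements per smallest
    wavelength to be resolved, i.e., AEL <= lambda_min / 6."""
    wavelength_m = c / f_max_hz
    return wavelength_m / elements_per_wavelength * 1000.0

ael_acoustic = max_edge_length_mm(16000.0)  # ~3.5 mm for frequencies up to 16 kHz
```

Note that the geometric requirement of roughly 1 mm for the pinna is stricter than this acoustic bound and therefore dominates in practice.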
Interestingly, only the pinna regions contributing to the HRTF (compare Figure 4b) need to be represented accurately , and the remainder of the geometry can be modelled more roughly. This applies especially to the head, torso and neck, which can be represented by larger elements. These anatomical parts can additionally be approximated by simple geometric shapes, e.g., a sphere for the head, a cylinder for the neck and a rectangular cuboid or an ellipsoid for the torso , see e.g., Figure 4a. To emphasise the sophisticated direction dependency of the pinna, Figure 9 shows the calculated sound-pressure distribution over the surface of the pinna. This simulation is performed by defining one element in the centre of the ear canal as a sound source and evaluating the resulting sound-pressure field at the vertices of the rest of the geometry; the procedure is explained thoroughly in Section 4.4.
The geometry can be captured via numerous approaches : a laser scan , medical imaging techniques such as magnetic resonance imaging (MRI) [69, 126] and computed tomography (CT) , or photogrammetric reconstruction . Laser, MRI and CT scans yield high-resolution meshes offering a small geometric error, but in turn they need special equipment. Laser scans are based on line-of-sight propagation and are able to measure short distances with an accuracy of up to 0.01 mm. The downside of line-of-sight propagation is that the manifolds of the pinnae are not easy to capture. In the medical imaging approaches, different downsides arise: acquiring the pinna geometry via MRI is not a trivial process because the pinnae are flattened by the head support. This leads to two separate MRI measurements, one for each ear. The anatomy is then captured in ‘slices’ that can be stitched together rather easily in the post-processing. CT captures the anatomy in a similar way, but due to the high radiation exposure, such scans are usually not done with human subjects but with (silicone) mouldings of the listener’s ear. The overall procedure may take more time than an acoustic HRTF measurement and requires the listener either to manufacture a moulding or to meet rather specific criteria for the scanning equipment (e.g., no tattoos, piercings, or implants). As an alternative, recent advances have been made towards more widely applicable approaches such as photogrammetry [23, 128]. Photogrammetry is not only non-invasive but can also be done with widely available equipment, e.g., a smartphone or a digital camera, without requiring the listener to travel to a specialised facility. In a nutshell, the photogrammetric approach works as follows: a set of photographs from different directions is made for each ear [127, 129], the so-called
Acoustic simulations require information about the acoustic properties of the simulated objects. HRTFs can be simulated with the 3D geometry represented as fully reflective, i.e., all surfaces having infinite acoustic impedance. With respect to localisation performance, only a small
In order to calculate HRTFs with sufficient spectral accuracy, the number of elements needs to be in the range of several tens of thousands, which determines the requirements on the computational power. Such large numerical problems usually require large amounts of memory, in the range of gigabytes. Moreover, the calculation time may reach a few days, especially when calculating HRTFs for many frequencies with high-resolution meshes. Note that if the algorithm calculates HRTFs for each frequency independently, the calculations can be performed in parallel, and computer clusters can be used. This reduces the calculation time to a few hours for HRTFs covering the full hearing range and a mesh of several tens of thousands of elements.
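The per-frequency independence mentioned above can be sketched as follows. This is a minimal illustration, not a real solver: `solve_frequency` is a hypothetical stand-in for one full frequency-domain FEM/BEM solve.

```python
# Sketch: per-frequency HRTF solves are independent and can run in
# parallel; solve_frequency is a hypothetical stand-in for one full
# frequency-domain solve (FEM/BEM), not a real solver API.
from concurrent.futures import ThreadPoolExecutor

def solve_frequency(f_hz: float) -> complex:
    """Placeholder for one frequency-domain solve; returns a dummy value."""
    return complex(f_hz, 0.0)

def calculate_hrtf(frequencies, workers: int = 4):
    # No communication is needed between frequencies, so the solves can be
    # distributed across workers (or across nodes of a computer cluster).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(solve_frequency, frequencies))

freqs = [100.0 * i for i in range(1, 201)]  # 100 Hz grid up to 20 kHz
results = calculate_hrtf(freqs)
print(len(results))  # one solution per frequency
```

In practice, each worker would be a separate process or cluster job, since a single solve occupies gigabytes of memory on its own.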
All the algorithms for numerical HRTF calculation are based on the propagation of sound waves in the free field around a scattering object (also “scatterer”), usually described by the Helmholtz equation

$$\Delta p(x) + k^2\, p(x) = -s(x), \quad x \in \Omega, \tag{2}$$

where $\Delta$ denotes the Laplace operator in 3D, $p(x)$ denotes the (complex-valued) sound pressure at a point $x$ for a given wavenumber $k$ in the domain $\Omega$ around the scattering object and $s(x)$ denotes the (complex-valued) contribution of an external sound source to the acoustic field around the object. The wavenumber is $k = 2\pi f/c$ with $f$ being the frequency and $c$ the speed of sound.
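The wavenumber directly dictates the mesh resolution discussed earlier: the elements must resolve the shortest wavelength of interest. The sketch below computes the wavenumber and a maximum element edge length; the six-elements-per-wavelength factor is a common rule of thumb assumed here, not a value prescribed by this chapter.

```python
# Sketch: wavenumber k = 2*pi*f/c and a six-elements-per-wavelength rule
# of thumb for the maximum edge length of a surface mesh (the factor 6 is
# a customary guideline, assumed for illustration).
import math

SPEED_OF_SOUND = 343.0  # m/s, air at approximately 20 degrees Celsius

def wavenumber(f_hz: float, c: float = SPEED_OF_SOUND) -> float:
    """k = 2*pi*f / c."""
    return 2.0 * math.pi * f_hz / c

def max_edge_length(f_hz: float, elements_per_wavelength: int = 6) -> float:
    """Largest admissible element edge length for the given frequency."""
    wavelength = SPEED_OF_SOUND / f_hz
    return wavelength / elements_per_wavelength

k = wavenumber(20000.0)        # highest audible frequency
h = max_edge_length(20000.0)   # roughly 2.9 mm edge length at 20 kHz
print(f"k = {k:.1f} rad/m, h = {h * 1000:.2f} mm")
```

At 20 kHz this yields edges of a few millimetres, which explains the tens of thousands of surface elements mentioned above.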
In order to solve the Helmholtz equation for a given scatterer, boundary conditions are necessary. The boundary condition at the rigid surface $\Gamma$ of the scatterer is

$$\frac{\partial p}{\partial n}(x) = 0, \quad x \in \Gamma,$$

where $n$ denotes the normal vector at the surface pointing away from the object. For the boundary condition at infinite distance, the so-called Sommerfeld radiation condition is imposed:

$$\frac{\partial p}{\partial r}(x) - \mathrm{i}\,k\,p(x) = o\!\left(r^{-1}\right) \quad \text{for } r = |x| \to \infty,$$

with the Landau symbol $o(\cdot)$ indicating that the left side decays much faster than the right side as $r$ grows. This ensures that the sound field decays away from the object.
For the calculation of HRTFs, the Helmholtz equation can be solved numerically by means of various approaches, which are based on a discretisation of the exterior domain around the scatterer. Some of these methods solve the Helmholtz equation in the frequency domain, and others solve its counterpart, the wave equation, in the time domain. In general, the formulations and the results in the different domains can be transformed into each other by using, e.g., the Fourier transformation. In the following, we describe the most prominent methods used for HRTF calculations.
4.1 The finite-element method
The finite-element method (FEM) solves the Helmholtz equation, Eq. (2), considering the scattering object or the domain around it as a volume. Figure 10 shows an example of a finite (domain) volume $\Omega$ considered in the calculations, with the scatterer with surface $\Gamma$ placed inside that volume. To simulate the acoustic field around that object, the weak form of the Helmholtz equation is used, i.e., the equation is multiplied by a set of known test functions $\varphi$ and integrated over the whole domain, thus transforming the partial differential equation (e.g., the Helmholtz equation) into an integral equation that is easier to solve numerically:

$$\int_\Omega \left( \nabla p \cdot \nabla \varphi - k^2\, p\, \varphi \right) \mathrm{d}x = \int_\Omega s\,\varphi\, \mathrm{d}x \quad \text{for all test functions } \varphi, \tag{3}$$

where the boundary term arising from the integration by parts vanishes for the rigid surface.
Secondly, the unknown pressure $p$ is approximated by a linear combination

$$p(x) \approx \sum_{j=1}^{N} p_j\, \varphi_j(x) \tag{4}$$

of so-called ansatz functions $\varphi_j$. These ansatz functions, or element shape functions, help at interpolating between the discrete solutions for each point of the mesh. They are, in general, simple (real) polynomials defined locally on the elements of the mesh, e.g., having the value of 1 at their own node of the mesh and 0 at all other nodes. Recent advances have been made towards adaptively finding higher-order polynomials and thus drastically reducing the computational effort [133, 134]. In theory, Eq. (3) should be fulfilled for all possible test functions $\varphi$; in practice, however, often the ansatz functions are also used as test functions, i.e., $\varphi \in \{\varphi_1, \ldots, \varphi_N\}$. With this choice, Eq. (3) can be transformed into a linear system of equations

$$\mathbf{A}\mathbf{p} = \mathbf{b}, \tag{5}$$

where $\mathbf{A}$ with entries $A_{ij} = \int_\Omega \left( \nabla\varphi_j \cdot \nabla\varphi_i - k^2\, \varphi_j\, \varphi_i \right) \mathrm{d}x$ is the system matrix, $\mathbf{b}$ with entries $b_i = \int_\Omega s\,\varphi_i\, \mathrm{d}x$ represents the sound source, and $\mathbf{p}$ is the vector containing the unknown coefficients of the representation Eq. (4).
In general, the unknown coefficients $p_j$ represent the complex sound pressure at the corresponding nodes of the mesh. The integrals involved are solved using numerical quadrature methods.
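As a minimal illustration of how the linear system in Eq. (5) arises, the sketch below assembles the FEM system matrix for the 1D Helmholtz equation with linear hat functions as ansatz and test functions. This is a toy setting under simplifying assumptions (1D, no source term, no PML); real HRTF calculations use 3D meshes.

```python
# Sketch: assembling the FEM system matrix A = K - k^2 * M for the 1D
# Helmholtz equation with linear ("hat") ansatz functions on a uniform
# grid; each hat function is 1 at its own node and 0 at all others.
import numpy as np

def assemble_helmholtz_1d(n_nodes: int, length: float, k: float):
    h = length / (n_nodes - 1)
    K = np.zeros((n_nodes, n_nodes))  # stiffness: integrals of phi_i' phi_j'
    M = np.zeros((n_nodes, n_nodes))  # mass: integrals of phi_i phi_j
    # Element-wise assembly: each interval contributes a 2x2 local matrix.
    k_loc = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
    m_loc = np.array([[2.0, 1.0], [1.0, 2.0]]) * h / 6.0
    for e in range(n_nodes - 1):
        idx = np.ix_([e, e + 1], [e, e + 1])
        K[idx] += k_loc
        M[idx] += m_loc
    return K - k**2 * M

A = assemble_helmholtz_1d(n_nodes=101, length=1.0, k=10.0)
print(A.shape)  # (101, 101); sparse and banded, as typical for the FEM
```

The banded, sparse structure of $\mathbf{A}$ is what makes the FEM systems tractable despite their size.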
When calculating HRTFs, the space around the scatterer is assumed to be continuous and infinite; in practice, this space has to be discretised and truncated to a finite domain by inserting a virtual boundary, on which conditions have to be set to avoid any reflections, thus keeping in line with anechoic (free-field) conditions. There are several methods to do so, with the so-called perfectly matched layers (PMLs) being the most popular among the reviewed HRTF calculation approaches. A PML is created by inserting an artificial boundary inside the domain, e.g., a sphere with a sufficiently large radius, beyond which artificially damped equations represent a solution that can then be numerically calculated, fulfilling the Sommerfeld radiation condition. Recent advances have been made to define PMLs automatically by extruding the boundary layer of the mesh and obtaining the geometric parameters during the extrusion step.
The FEM has been widely used in HRTF calculations [137, 138, 139, 140, 141] and yields results similar to acoustical HRTF measurements, with spectral magnitude errors of approximately 1 dB [137, 141]. The downside, however, is the need to model 3D volumes around the head, resulting in models with a large number of elements, which has a strong impact on the calculation duration.
4.2 The finite-difference time-domain method
An approach similar to the FEM can also be followed in the time domain. By using a short sound burst in the time domain as an input signal, the HRTFs within a wide frequency range can be calculated at once. This approach is called the finite-difference time-domain (FDTD) method and can be derived by solving the wave equation in the time domain

$$\frac{\partial^2 p}{\partial t^2}(x, t) - c^2\, \Delta p(x, t) = s(x, t),$$

where $p(x,t)$ denotes the sound pressure and $s(x,t)$ the source signal, both fields in the time domain. The PML is applied to create the boundary conditions for outgoing sound waves. The evaluation grid is sampled evenly in cells across the domain with grid spacing $\Delta x$, considering discrete time steps $\Delta t$. A key parameter for numerical stability of the FDTD is the Courant number

$$C = \frac{c\, \Delta t}{\Delta x},$$
defining the number of cells the sound propagates per time step. Typically, in order to obtain stable HRTF calculations, the Courant number is kept at or below the stability limit, which is $1/\sqrt{3} \approx 0.577$ for a cubic 3D grid.
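The interplay of grid spacing, time step and Courant number can be sketched as follows; the $1/\sqrt{3}$ bound for a cubic 3D grid is the standard FDTD stability result assumed here.

```python
# Sketch: FDTD grid parameters and the Courant number C = c*dt/dx; the
# 3D stability limit 1/sqrt(3) is the standard bound for a cubic grid.
import math

def courant_number(c: float, dt: float, dx: float) -> float:
    return c * dt / dx

def max_stable_dt(c: float, dx: float, dims: int = 3) -> float:
    """Largest time step satisfying C <= 1/sqrt(dims)."""
    return dx / (c * math.sqrt(dims))

c = 343.0     # speed of sound in air, m/s
dx = 1e-3     # 1 mm grid, fine enough to resolve the pinna geometry
dt = max_stable_dt(c, dx)           # on the order of microseconds
print(courant_number(c, dt, dx))    # equals 1/sqrt(3), about 0.577
```

Note how a submillimetre spatial grid forces a microsecond-scale time step, which is one reason FDTD runs on pinna geometries are computationally demanding.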
Figure 11 shows a 2D representation of a mesh used in the FDTD method. Note that because the mesh needs to consist of evenly spaced elements, most objects cannot be represented accurately and a sampling error is introduced at the boundary surface of the object. Additionally, as derivatives of functions are approximated by finite differences, the underlying arithmetic operations are exact only at infinite resolution; on physical computers, the precision depends on the number format used and the grid size, introducing errors in the results. Refining the grid is a potential solution to the sampling error of staircase approximations [144, 145], and for HRTF calculations, spectral magnitude errors of 1 dB up to 8 kHz and 2 dB up to 18 kHz can be achieved [146, 147], suggesting only a negligible increase in localisation errors when listening to HRTFs calculated by the FDTD method.
Because of the additional sampling errors for irregular domains, recent advances have been made towards using quasi-Cartesian grids, dynamically chosen grid resolutions, or towards the finite-volume time-domain (FVTD) method, which is based on the energy conservation and dissipation of the system as a whole and uses the integral formulation of the FDTD. One solution approach there is to adaptively sample the grid at the boundary and introduce unstructured or fitted cells [151, 152]. A thorough comparison between the FEM, FDTD and FVTD methods is available in .
The FDTD method has been widely applied to HRTF calculations [145, 146, 154, 155], and it offers the advantage of calculating broadband HRTFs without additional computational cost when multiple inputs or outputs are used. However, because of the complex geometry of the pinnae, a submillimetre sampling grid is required, resulting in delicate preprocessing.
4.3 The boundary-element method
The boundary-element method (BEM) is based on a special set of test functions in the weak formulation of the Helmholtz equation, Eq. (3), namely the Green’s function

$$G(x, y) = \frac{e^{\mathrm{i}kr}}{4\pi r},$$

where $r = |x - y|$, and $x$ and $y$ are two points in space. By using this function, it is possible to reduce the weak form of the Helmholtz equation to an integral equation, i.e., the boundary integral equation (BIE), that only involves integrals over the surface $\Gamma$ of the object, $G$ and its normal derivative:

$$\frac{1}{2}\, p(x) = \int_\Gamma \left( p(y)\, \frac{\partial G}{\partial n_y}(x, y) - G(x, y)\, \frac{\partial p}{\partial n}(y) \right) \mathrm{d}s_y + p_{\mathrm{inc}}(x), \quad x \in \Gamma,$$

where $\partial G/\partial n_y$ is obtained by the derivative of the Green’s function at a point $y$ in the direction of the vector $n$ normal to the surface at this position, and $p_{\mathrm{inc}}$ denotes the incident pressure produced by the sound source.
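The two kernels of the BIE can be evaluated pointwise; the sketch below assumes the free-field Green’s function $e^{\mathrm{i}kr}/(4\pi r)$ and its derivative along the unit normal at the surface point (function names are illustrative).

```python
# Sketch: the free-field Green's function G(x, y) = exp(i*k*r)/(4*pi*r)
# and its derivative in the direction of the unit normal n at y, the two
# kernels appearing in the boundary integral equation.
import numpy as np

def green(x, y, k):
    r = np.linalg.norm(np.asarray(x) - np.asarray(y))
    return np.exp(1j * k * r) / (4.0 * np.pi * r)

def green_normal_derivative(x, y, n, k):
    """dG/dn_y: derivative with respect to y along the unit normal n."""
    d = np.asarray(x) - np.asarray(y)
    r = np.linalg.norm(d)
    # dG/dr = (i*k - 1/r) * G, and the directional derivative of r with
    # respect to y along n is -(d . n) / r.
    return (1j * k - 1.0 / r) * green(x, y, k) * (-np.dot(d, n) / r)

x = np.array([0.0, 0.0, 1.0])   # evaluation point
y = np.array([0.0, 0.0, 0.0])   # surface point
n = np.array([0.0, 0.0, 1.0])   # unit normal at y
print(green(x, y, k=100.0))
```

In an actual BEM code, these kernels are integrated over the surface elements rather than evaluated at single points.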
In comparison with the other two methods, the BEM has the advantage that only the surface of the scatterer needs to be discretised, which substantially reduces the number of unknowns.
In order to solve a BEM problem, the BIE is discretised and solved by using methods such as the Galerkin, collocation or Nyström methods [157, 158, 159], all with the common goal of yielding a linear system of equations.
For the Galerkin method, the unknown pressure is approximated by a linear combination of ansatz functions as in Eq. (4). The BIE is again multiplied with a set of test functions (similar to the test functions used in the FEM) and integrated over the surface, yielding a linear system of equations as in Eq. (5), where, for a rigid surface, the matrix entries involve double integrals over pairs of elements, e.g., $A_{ij} = \frac{1}{2}\int_\Gamma \varphi_i\, \varphi_j\, \mathrm{d}s - \int_\Gamma \int_\Gamma \frac{\partial G}{\partial n_y}(x, y)\, \varphi_j(y)\, \varphi_i(x)\, \mathrm{d}s_y\, \mathrm{d}s_x$.
Another commonly used approach, especially in engineering, is collocation with constant elements, i.e., the sound field is assumed to be constant on each element $\Gamma_j$ of the mesh, and the BIE is solved at the midpoints $x_i$ of the elements (the set of all $x_i$ are called collocation nodes), yielding a linear system of equations as in Eq. (5), where, for a rigid surface, $A_{ij} = \frac{1}{2}\delta_{ij} - \int_{\Gamma_j} \frac{\partial G}{\partial n_y}(x_i, y)\, \mathrm{d}s_y$
and $b_i$ is the incident pressure at $x_i$ produced by the sound source positioned outside the head. The integrals over each element are numerically solved using appropriate quadrature formulas (weighted sums of function values). The BIE is solved for a given set of frequencies, and the solutions at the collocation nodes are then used to derive the HRTFs. Note that the collocation method can be interpreted as the Galerkin method utilising delta functionals as test functions. A thorough comparison between the Galerkin and collocation approaches can be found in .
The discretisation of just the surface introduces additional challenges. First, the Green’s function becomes singular at the boundary where $x = y$, and special quadrature formulas need to be used close to these singularities [161, 162]; second, the system matrix, although much smaller than in volume-based methods, is usually densely populated, which poses a challenge for computer memory and the efficiency of the linear solver used. When considering HRTF calculations for frequencies up to 20 kHz, high-resolution meshes are required and the corresponding linear systems may contain up to 100,000 unknowns.
In order to efficiently deal with such large systems, the BEM can be coupled with methods speeding up matrix–vector multiplications, such as the fast-multipole method (FMM) or H-matrices (so-called ‘hierarchical’ matrices). These methods have enabled an efficient and feasible calculation of HRTFs. In a nutshell, these methods aim at providing an efficient and fast matrix–vector multiplication and are based on two steps. First, the elements of the mesh are grouped into clusters of approximately the same size with given cluster midpoints. Second, for two clusters $X$ and $Y$ that are sufficiently far apart from each other, a separable approximation of the Green’s function of the form

$$G(x, y) \approx \sum_{m=1}^{M} u_m(x)\, v_m(y), \quad x \in X,\ y \in Y,$$
is found. This approximation has two advantages: the local expansions $u_m$ and $v_m$ have to be computed only once for each cluster, and the interaction between the elements of different clusters is reduced to a single interaction of the cluster midpoints. The resulting linear system of equations is then solved using an iterative equation solver.
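The reason such separable approximations work can be demonstrated numerically: the Green’s-function interaction block between two well-separated clusters has a rapidly decaying singular-value spectrum, i.e., it is numerically low rank. The sketch below uses illustrative cluster sizes and an assumed tolerance.

```python
# Sketch: the interaction matrix between two well-separated point clusters
# is numerically low rank, which is what FMM/H-matrix compression exploits.
import numpy as np

rng = np.random.default_rng(0)
k = 10.0
# Two clusters of 200 points each, centred 1 m apart, each a few cm across.
X = rng.normal(scale=0.05, size=(200, 3))
Y = rng.normal(scale=0.05, size=(200, 3)) + np.array([1.0, 0.0, 0.0])

# Dense Green's-function interaction block between the two clusters.
R = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
G = np.exp(1j * k * R) / (4.0 * np.pi * R)

s = np.linalg.svd(G, compute_uv=False)
rank = int(np.sum(s / s[0] > 1e-6))  # numerical rank at 1e-6 tolerance
print(rank)  # far smaller than the full dimension of 200
```

Storing only the few dominant terms of such blocks is what reduces the memory and multiplication cost from quadratic to nearly linear in the number of elements.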
Although the Helmholtz equation for external problems has a unique solution at all frequencies, the BIE has uniqueness problems at certain critical frequencies [159, 167]. Thus, to avoid numerical problems, the BEM needs to be stabilised at these frequencies, e.g., by using the Burton-Miller method . BEM has been widely used to calculate HRTFs [165, 168, 169, 170, 171] analysing the process from various perspectives. When applied to an accurate and high-resolution representation of the pinna geometry, BEM can yield similar results to the acoustic HRTF measurements by means of sound localisation performance [101, 172].
In principle, in order to calculate an HRTF set, the Helmholtz equation needs to be solved for every source position separately, leading to up to several thousand right-hand sides in Eq. (5). Solving that many systems cannot be done quickly even with the help of iterative solvers. On the other hand, the HRTF calculation for the second ear is quite simple because the solution obtained from the solver is already available for every element of the surface, i.e., also at the element representing the ear canal of the second ear. The approach of reciprocity can help to significantly speed up the calculations by swapping the roles of the many source positions and the two elements representing the ear canals, requiring us to solve Eq. (5) only twice, i.e., once for each of the ears.
Helmholtz’ reciprocity theorem states that switching the source and receiver positions does not affect the observed sound pressure. When applied to HRTF calculations, virtual loudspeakers are placed at the entrances of the ear canals (replacing the virtual microphones), and the many simulated sound sources are represented by many virtual microphones (replacing the many virtual loudspeakers around the listener). By doing so, the computationally expensive part of the BEM, i.e., solving a linear system of equations to calculate the sound pressure at the surface, needs to be done only twice, namely once for each ear. Subsequently, the sound pressure at positions around the head can be calculated fairly easily and efficiently.
In more detail, assume that a point source with strength $q$ at the position $x_s$ causes a mean sound pressure of $\bar p$ over a small domain with area $A$ at the entrance of the ear canal. If the domain is sufficiently small, the mean sound pressure is an accurate representation of the actual sound pressure in this domain. By applying reciprocity, we introduce a reciprocal sound source at the entrance of the ear canal, which vibrates with a velocity $v$, and then calculate the sound pressure $p_r$ at the position $x_s$ of the actual sound source around the listener. The pressures $\bar p$ and $p_r$ are linked by

$$\frac{\bar p}{q} = \frac{p_r}{v\,A}.$$
The reciprocal sound source can be modelled by vibrating elements, i.e., elements with an additional velocity boundary condition

$$\frac{\partial p}{\partial n}(x) = \mathrm{i}\,\omega\,\rho_0\, v \quad \text{on the vibrating elements},$$

where $\omega = 2\pi f$ and $\rho_0$ is the density of air. Note that $v$ can be an arbitrary positive number because when calculating HRTFs [see for example Eq. (1)], the pressure is normalised by the reference pressure, thus cancelling $v$. With this additional boundary condition, first, the BEM is used to calculate the sound field at the surface $\Gamma$, and then, Green’s representation is applied to calculate the pressure at all positions $x_s$ of the actual sound sources,

$$p(x_s) = \int_\Gamma \left( p(y)\, \frac{\partial G}{\partial n_y}(x_s, y) - G(x_s, y)\, \frac{\partial p}{\partial n}(y) \right) \mathrm{d}s_y.$$
Note that this equation is evaluated after the discretisation, and because $p$ at the surface $\Gamma$ is known from the BEM solution, the calculation of the sound pressure around the head reduces to a simple matrix multiplication.
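This final evaluation step can be sketched as follows. All sizes and matrix entries are illustrative placeholders; in practice, the matrix comes from the discretised Green’s representation over the surface mesh.

```python
# Sketch: with the surface solution known, evaluating the field at all
# former source positions reduces to one matrix multiplication per ear.
# Shapes are illustrative: 2000 surface elements, 500 source directions.
import numpy as np

rng = np.random.default_rng(1)
n_elements, n_directions = 2_000, 500

# Discretised Green's-representation operator (random placeholder values;
# in practice its entries come from G and dG/dn over the mesh elements).
M = rng.standard_normal((n_directions, n_elements)) + 0j

p_left = rng.standard_normal(n_elements) + 0j    # surface solution, left ear
p_right = rng.standard_normal(n_elements) + 0j   # surface solution, right ear

hrtf_left = M @ p_left      # pressure at all 500 directions at once
hrtf_right = M @ p_right
print(hrtf_left.shape)      # (500,)
```

Since this multiplication is cheap compared with solving the linear system, adding more source directions costs almost nothing once the two reciprocal solves are done.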
Reciprocity, combined with the FMM-coupled BEM, has been applied to calculate HRTFs, enabling the calculation of a large spatial HRTF set within a few hours even on a standard desktop computer.
5. Other issues related to HRTF acquisition
Over decades, HRTFs have been collected and stored in databases. Such databases are important for educational purposes, the training of neural-network algorithms [34, 37] and further research [23, 25, 26, 27, 28, 173]. While in the early days of HRTF research, HRTFs were stored by each lab in a different format, since 2015 the spatially oriented format for acoustics (SOFA) has been available to store HRTFs in a flexible but well-described way, facilitating an easy exchange between labs and applications. SOFA is a standard of the Audio Engineering Society under the name AES69. SOFA provides a uniform description of spatially oriented acoustic data such as HRTFs, spatial room impulse responses, and directivities.
When it comes to anthropometric data, unfortunately, there is currently no common format to specify and exchange them. This is partially because it is not yet known which data are important. Some laboratories use the CIPIC parameters, some have extended them, and others have created whole new sets of parameters [128, 175]. An overview of currently used anthropometric parameters can be found in . The development of parametric pinna models may shed light on the relevance of the parameters to be stored in the future. The listener’s geometry can also be stored in non-parametric representations such as meshes and point clouds of the listener’s ears and head. To this end, typical 3D dataset formats are used, e.g., OBJ, PLY or STL. These formats are widely used in computer graphics and thus easily accessible by many corresponding applications. A large collection of HRTF databases stored in SOFA, with some of them combined with meshes stored in OBJ, PLY and STL files, is available at the SOFA website.1
When HRTFs are obtained, there is a strong demand to evaluate their quality. This is especially interesting when comparing the results from numerical HRTF calculations. The evaluations can be performed at various levels: geometrical, acoustical and perceptual. The evaluation at the geometric level can be done by comparing the deviation between two meshes of the pinna and representing the deviation as the Hausdorff distance. The evaluation at the acoustic level can be done by calculating the spectral distortion

$$\mathrm{SD} = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \left( 20 \log_{10} \frac{|H_{\mathrm{calc}}(f_i)|}{|H_{\mathrm{meas}}(f_i)|} \right)^{2} } \quad \text{(in dB)},$$

where $H_{\mathrm{calc}}$ denotes the calculated HRTF and $H_{\mathrm{meas}}$ the measured one, summarised over $N$ discrete frequencies $f_i$. The evaluation at the perceptual level can be simulated by means of auditory modelling or assessed by direct performance of localisation experiments [50, 90, 147]. Especially the evaluation of localisation errors in the median plane is relevant because sound localisation in the median plane is directly related to the quality of the monaural spectral features in the HRTFs [46, 178]. A calculated HRTF set yielding similar perceptual results as the natural listener’s HRTFs can be described as perceptually valid.
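The spectral distortion can be computed in a few lines; the sketch below assumes the common RMS-of-dB-differences definition over discrete frequencies.

```python
# Sketch: spectral distortion between a calculated and a measured HRTF,
# computed as the RMS of the dB magnitude differences over N discrete
# frequencies (assuming the common RMS-of-dB definition).
import numpy as np

def spectral_distortion(h_calc, h_meas):
    diff_db = 20.0 * np.log10(np.abs(h_calc) / np.abs(h_meas))
    return np.sqrt(np.mean(diff_db**2))

h_meas = np.array([1.0, 0.5, 0.25])
h_calc = h_meas * 10 ** (1.0 / 20.0)        # uniform +1 dB offset
print(spectral_distortion(h_calc, h_meas))  # approximately 1.0 dB
```

A uniform 1 dB magnitude offset yields an SD of 1 dB, matching the error magnitudes reported for the FEM and FDTD results above.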
With a specialised measurement setup, acoustic HRTF measurements can be done within a few minutes. Still, such setups are expensive and require the listener to sit or stand still for the whole measurement duration. The requirement of specialised components has limited the popularity of the acoustic methods. Recent advances, however, have been made by integrating head-movement tracking into systems to be used at home, especially since the commercialisation of VR headsets. These advances provide an easy-to-use measurement setup, but it still needs to be investigated how many and which measurement positions are crucial to acquire a sufficient measurement grid for perceptually valid HRTFs.
With the availability of numerical HRTF calculations, the acquisition of personalised HRTFs has undergone significant advances. While the acoustic HRTF measurement still remains the reference acquisition method, numerical HRTF calculation paves the way towards personalised HRTFs available for a wide audience. The most widely used approaches, FEM, FDTD, BEM and BEM coupled with the FMM, can yield acoustically and perceptually valid results when applied under optimal conditions.
Machine learning and neural networks are gaining popularity and, in the future, may push the usability of numerical HRTF calculations even further. For example, neural networks might be able to support the photogrammetric mesh acquisition or even estimate HRTFs directly from listener-specific anthropometric data such as photographs. Further improvements in terms of efficiency, accuracy and precision remain an ongoing subject of research.
While SOFA clearly defines how to store an HRTF dataset, a similar definition for the description of anthropometric data is still not available. This might be rooted in our poor understanding of the importance of the parts of the pinna and their contribution to the HRTF. Here, a clear goal is to better understand the anthropometry and its relation to HRTFs. All this future work heads in the direction of expanding access to personalised HRTFs, enabling their availability for everyone.
This work was supported by the Austrian Research Promotion Agency (FFG, project ‘softpinna’ 871263) and the European Union (EU, project ‘SONICOM’ 101017743, RIA action of Horizon 2020). We thank Harald Ziegelwanger for visualising the sound pressure in Figure 9.
Conflict of interest
The authors declare no conflict of interest.
Algazi VR, Avendano C, Duda RO. Elevation localization and head-related transfer function analysis at low frequencies. The Journal of the Acoustical Society of America. 2001; 109(3):1110-1122. DOI: 10.1121/1.1349185
Batteau DW. The role of the pinna in human localization. Proceedings of the Royal Society of London Series B. Biological Sciences. 1967; 168(1011):158-180. DOI: 10.1098/rspb.1967.0058
Baumgartner R, Reed DK, Tóth B, Best V, Majdak P, Colburn HS, et al. Asymmetries in behavioral and neural responses to spectral cues demonstrate the generality of auditory looming bias. Proceedings of the National Academy of Sciences. 2017; 114(36):9743-9748, ISSN: 0027-8424, 1091-6490. DOI: 10.1073/pnas.1703247114
Fisher HG, Freedman SJ. The role of the pinna in auditory localization. Journal of Auditory Research. 1968; 168(1011):158-180
Hebrank J, Wright D. Spectral cues used in the localization of sound sources on the median plane. The Journal of the Acoustical Society of America. 1974; 56(6):1829-1834. DOI: 10.1121/1.1903520
Musicant AD, Butler RA. The influence of pinnae-based spectral cues on sound localization. The Journal of the Acoustical Society of America. 1984; 75(4):1195-1200. DOI: 10.1121/1.390770
Majdak P, Baumgartner R, Jenny C. Formation of three-dimensional auditory space. In: Blauert J, Braasch J, editors. The Technology of Binaural Understanding, Modern Acoustics and Signal Processing. Cham, ISBN: 978-3-030-00386-9: Springer International Publishing; 2020. pp. 115-149. DOI: 10.1007/978-3-030-00386-9_5
Majdak P, Baumgartner R, Laback B. Acoustic and non-acoustic factors in modeling listener-specific performance of sagittal-plane sound localization. Frontiers in Psychology. 2014; 5:319. DOI: 10.3389/fpsyg.2014.00319
Seeber BU, Fastl H. Subjective selection of non-individual head-related transfer functions. In: Proceedings of the International Conference on Auditory Display. Atlanta, Georgia: Georgia Institute of Technology; 2003. pp. 259-262
Wenzel EM, Arruda M, Kistler DJ, Wightman FL. Localization using nonindividualized head-related transfer functions. The Journal of the Acoustical Society of America. 1993; 94(1):111-123. DOI: 10.1121/1.407089
Møller H, Sørensen MF, Hammershøi D, Jensen CB. Head-related transfer functions of human subjects. Journal of the Audio Engineering Society. 1995; 43:300-321
Macpherson EA, Middlebrooks JC. Listener weighting of cues for lateral angle: The duplex theory of sound localization revisited. The Journal of the Acoustical Society of America. 2002; 111(5 Pt 1):2219-2236. DOI: 10.1121/1.1471898
Reijniers J, Vanderelst D, Jin C, Carlile S, Peremans H. An ideal-observer model of human sound localization. Biological Cybernetics. 2014; 108(2):169-181, ISSN: 0340-1200. DOI: 10.1007/s00422-014-0588-4
Majdak P, Goupell MJ, Laback B. 3-d localization of virtual sound sources: Effects of visual environment, pointing method, and training. Attention, Perception, & Psychophysics. 2010; 72(2):454-469. DOI: 10.3758/APP.72.2.454
Majdak P, Carpentier T, Nicol R, Roginska A, Suzuki Y, Watanabe K, et al. Spatially oriented format for acoustics: A data exchange format representing head-related transfer functions. In: Proceedings of the 134th Convention of the Audio Engineering Society (AES), Page Convention Paper 8880. Roma, Italy: Audio Engineering Society; 2013
Majdak P, Hollomey C, Baumgartner R. The auditory modeling toolbox. In: The Technology of Binaural Listening. Berlin, Heidelberg: Springer; 2021. pp. 33-56
Søndergaard P, Majdak P. The auditory modeling toolbox. In: Blauert J, editor. The Technology of Binaural Listening. Berlin-Heidelberg, Germany: Springer; 2013. pp. 33-56. DOI: 10.1007/978-3-642-37762-4_2
Guezenoc C, Seguier R. HRTF individualization: A survey. In Audio Engineering Society convention 145, page Convention Paper 10129. New York, New York, United States: Audio Engineering Society; 2018
Hammershøi D, Møller H. Sound transmission to and within the human ear canal. The Journal of the Acoustical Society of America. 1996; 100(1):408-427. DOI: 10.1121/1.415856
Li S, Peissig J. Measurement of head-related transfer functions: A review. Applied Sciences. 2020; 10(14):5014. DOI: 10.3390/app10145014
Middlebrooks JC. Individual differences in external-ear transfer functions reduced by scaling in frequency. The Journal of the Acoustical Society of America. 1999; 106(3):1480-1492. DOI: 10.1121/1.427176
Iida K, Aizaki T, Kikuchi T. Toolkit for individualization of head-related transfer functions using parametric notch-peak model. Applied Acoustics. 2022; 189:108610. DOI: 10.1016/j.apacoust.2021.108610
Torres-Gallegos EA, Orduna-Bustamante F, Arámbula-Cosío F. Personalization of head-related transfer functions (HRTF) based on automatic photo-anthropometry and inference from a database. Applied Acoustics. 2015; 97:84-95. DOI: 10.1016/j.apacoust.2015.04.009
Guezenoc C, Seguier R. A wide dataset of ear shapes and pinna-related transfer functions generated by random ear drawings. The Journal of the Acoustical Society of America. 2020; 147(6):4087-4096. DOI: 10.1121/10.0001461
Jin CT, Zolfaghari R, Long X, Sebastian A, Hossain S, Glaunés J, et al. Considerations regarding individualization of head-related transfer functions. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Calgary, AB, Canada: IEEE; 2018. pp. 6787-6791. DOI: 10.1109/ICASSP.2018.8462613
Lu D, Zeng X, Guo X, Wang H. Personalization of head-related transfer function based on sparse principle component analysis and sparse representation of 3d anthropometric parameters. Australia: Acoustics; 2019. pp. 1-10. DOI: 10.1007/s40857-019-00169-y
Tommasini FC, Ramos OA, Hüg MX, Bermejo F. Usage of spectral distortion for objective evaluation of personalized hrtf in the median plane. International Journal of Acoustics & Vibration. 2015; 20(2):81-89
Zhang M, Ge Z, Liu T, Wu X, Qu T. Modeling of individual HRTFs based on spatial principal component analysis. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2020; 28:785-797. DOI: 10.1109/TASLP.2020.2967539
Zhang M, Kennedy R, Abhayapala T, Zhang W. Statistical method to identify key anthropometric parameters in HRTF individualization. In: 2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays. Edinburgh, Scotland: IEEE; 2011. pp. 213-218. DOI: 10.1109/HSCMA.2011.5942401
Hu H, Zhou L, Zhang J, Ma H, Wu Z. Head related transfer function personalization based on multiple regression analysis. In: 2006 International Conference on Computational Intelligence and Security. Vol. 2. Guangzhou, China: IEEE; 2006. pp. 1829-1832. DOI: 10.1109/ICCIAS.2006.295380
Huang Q, Zhuang Q. HRIR personalisation using support vector regression in independent feature space. Electronics Letters. 2009; 45(19):1002-1003
Zolfaghari R, Epain N, Jin CT, Glaunes J, Tew A. Large deformation diffeomorphic metric mapping and fast-multipole boundary element method provide new insights for binaural acoustics. In: 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). London: IEEE; 2014. pp. 2863-2867. DOI: 10.1109/ICASSP.2014.6854123
Grijalva F, Martini LC, Florencio D, Goldenstein S. Interpolation of head-related transfer functions using manifold learning. IEEE Signal Processing Letters. 2017; 24(2):221-225. DOI: 10.1109/LSP.2017.2648794
Gebru ID, Marković D, Richard A, Krenn S, Butler GA, De la Torre F, et al. Implicit HRTF modeling using temporal convolutional networks. In: ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Singapore: IEEE; 2021. pp. 3385-3389. DOI: 10.1109/ICASSP39728.2021.9414750
Grijalva F, Martini L, Goldenstein S, Florencio D. Anthropometric-based customization of head-related transfer functions using isomap in the horizontal plane. In: 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). USA: IEEE; 2014. pp. 4473-4477. DOI: 10.1109/ICASSP.2014.6854448
Hu H, Zhou L, Ma H, Wu Z. HRTF personalization based on artificial neural network in individual virtual auditory space. Applied Acoustics. 2008; 69(2):163-172. DOI: 10.1016/j.apacoust.2007.05.007
Lee GW, Lee JH, Kim SJ, Kim HK. Directional audio rendering using a neural network based personalized HRTF. In INTERSPEECH, Brno, Czech Republic. pp. 2364–2365
Li L, Huang Q. HRTF personalization modeling based on RBF neural network. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. Vancouver, Canada: IEEE; 2013. pp. 3707-3710. DOI: 10.1109/ICASSP.2013.6638350
Miccini R, Spagnol S. A hybrid approach to structural modeling of individualized HRTFs. In: 2021 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW). Lisbon, Portugal: IEEE; 2021. pp. 80-85. DOI: 10.1109/VRW52623.2021.00022
Shu-Nung Y, Collins T, Liang C. Head-related transfer function selection using neural networks. Archives of Acoustics. 2017; 42(3):365-373. DOI: 10.1515/aoa-2017-0038
Zhou Y, Jiang H, Ithapu VK. On the predictability of HRTFs from ear shapes using deep networks. In: ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). London: IEEE; 2021. pp. 441-445. DOI: 10.1109/ICASSP39728.2021.9414042
Bilinski P, Ahrens J, Thomas MR, Tashev IJ, Platt JC. HRTF magnitude synthesis via sparse representation of anthropometric features. In: 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). London: IEEE; 2014. pp. 4468-4472. DOI: 10.1109/ICASSP.2014.6854447
Ghorbal S, Auclair T, Soladie C, Seguier R. Pinna morphological parameters influencing HRTF sets. In: Proceedings of the 20th International Conference on Digital Audio Effects (DAFx-17). Edinburgh: University of Edinburgh; 2017. pp. 353-359
Mokhtari P, Takemoto H, Nishimura R, Kato H. Vertical normal modes of human ears: Individual variation and frequency estimation from pinna anthropometry. The Journal of the Acoustical Society of America. 2016; 140(2):814-831. DOI: 10.1121/1.4960481
Onofrei MG, Miccini R, Unnthorsson R, Serafin S, Spagnol S. 3d ear shape as an estimator of HRTF notch frequency. In: 17th Sound and Music Computing Conference. Torino: Sound and Music Computing Network; 2020. pp. 131-137. DOI: 10.5281/zenodo.3898720
Spagnol S, Geronazzo M, Avanzini F. On the relation between pinna reflection patterns and head-related transfer function features. IEEE Transactions on Audio, Speech, and Language Processing. 2012; 21(3):508-519. DOI: 10.1109/TASL.2012.2227730
Pollack K, Majdak P, Furtado H. A parametric pinna model for the calculations of head-related transfer functions. In: Proceedings of Forum Acusticum. Lyon. 2020. pp. 1357-1360. DOI: 10.48465/fa.2020.02800
Stitt P, Katz BFG. Sensitivity analysis of pinna morphology on head-related transfer functions simulated via a parametric pinna model. The Journal of the Acoustical Society of America. 2021; 149(4):2559-2572, ISSN: 0001-4966. DOI: 10.1121/10.0004128
Katz BF, Parseihian G. Perceptually based head-related transfer function database optimization. The Journal of the Acoustical Society of America. 2012; 131(2):EL99-EL105. DOI: 10.1121/1.3672641
Baumgartner R, Majdak P, Laback B. Modeling sound-source localization in sagittal planes for human listeners. The Journal of the Acoustical Society of America. 2014; 136(2):791-802. DOI: 10.1121/1.4887447
Xie B, Zhong X, He N. Typical data and cluster analysis on head-related transfer functions from Chinese subjects. Applied Acoustics. 2015; 94:1-13. DOI: 10.1016/j.apacoust.2015.01.022
Toppila E, Pyykkö I, Starck J. Age and noise-induced hearing loss. Scandinavian Audiology. 2001; 30(4):236-244. DOI: 10.1080/01050390152704751
Klumpp RG, Eady HR. Some measurements of interaural time difference thresholds. The Journal of the Acoustical Society of America. 1956; 28:859-860. DOI: 10.1121/1.1908493
Blauert J. Spatial Hearing: The Psychophysics of Human Sound Localization. Cambridge, MA: The MIT Press; 1997
Raykar VC, Duraiswami R, Yegnanarayana B. Extracting the frequencies of the pinna spectral notches in measured head related impulse responses. The Journal of the Acoustical Society of America. 2005; 118(1):364-374. DOI: 10.1121/1.1923368
Takemoto H, Mokhtari P, Kato H, Nishimura R, Iida K. Mechanism for generating peaks and notches of head-related transfer functions in the median plane. The Journal of the Acoustical Society of America. 2012; 132(6):3832-3841. DOI: 10.1121/1.4765083
Algazi VR, Duda RO, Duraiswami R, Gumerov NA, Tang Z. Approximating the head-related transfer function using simple geometric models of the head and torso. The Journal of the Acoustical Society of America. 2002; 112(5):2053-2064. DOI: 10.1121/1.1508780
Macpherson EA, Middlebrooks JC. Vertical-plane sound localization probed with ripple-spectrum noise. The Journal of the Acoustical Society of America. 2003; 114(1):430-445. DOI: 10.1121/1.1582174
Goupell MJ, Majdak P, Laback B. Median-plane sound localization as a function of the number of spectral channels using a channel vocoder. The Journal of the Acoustical Society of America. 2010; 127(2):990-1001. DOI: 10.1121/1.3283014
Kulkarni A, Colburn HS. Role of spectral detail in sound-source localization. Nature. 1998; 396(6713):747-749. DOI: 10.1038/25526
Senova MA, McAnally KI, Martin RL. Localization of virtual sound as a function of head-related impulse response duration. Journal of the Audio Engineering Society. 2002; 50(1/2):57-66
Thavam S, Dietz M. Smallest perceivable interaural time differences. The Journal of the Acoustical Society of America. 2019; 145(1):458-468. DOI: 10.1121/1.5087566
Andreopoulou A, Katz BF. Identification of perceptually relevant methods of inter-aural time difference estimation. The Journal of the Acoustical Society of America. 2017; 142(2):588-598. DOI: 10.1121/1.4996457
Katz BF, Noisternig M. A comparative study of interaural time delay estimation methods. The Journal of the Acoustical Society of America. 2014; 135(6):3530-3540. DOI: 10.1121/1.4875714
Algazi VR, Avendano C, Duda RO. Estimation of a spherical-head model from anthropometry. Journal of the Audio Engineering Society. 2001; 49:472-479
Zhang W, Abhayapala TD, Kennedy RA, Duraiswami R. Insights into head-related transfer function: Spatial dimensionality and continuous representation. The Journal of the Acoustical Society of America. 2010; 127(4):2347-2357. DOI: 10.1121/1.3336399
Bomhardt R, de la Fuente Klein M, Fels J. A high-resolution head-related transfer function and three-dimensional ear model database. In: Proceedings of Meetings on Acoustics 172ASA. Vol. 29. Honolulu, HI, United States: ASA; 2016. p. 050002. DOI: 10.1121/2.0000467
Carpentier T, Bahu H, Noisternig M, Warusfel O. Measurement of a head-related transfer function database with high spatial resolution. In: 7th Forum Acusticum (EAA). Kraków, Poland: EAA; 2014
Jin CT, Guillon P, Epain N, Zolfaghari R, Van Schaik A, Tew AI, et al. Creating the Sydney York morphological and acoustic recordings of ears database. IEEE Transactions on Multimedia. 2013; 16(1):37-46. DOI: 10.1109/TMM.2013.2282134
Mills AW. On the minimum audible angle. The Journal of the Acoustical Society of America. 1958; 30(4):237-246. DOI: 10.1121/1.1909553
Wersényi G. HRTFs in human localization: Measurement, spectral evaluation and practical use in virtual audio environment. Dissertation. Cottbus, Germany: Brandenburg University of Technology; 2002
Zhong X, Xie B, et al. Head-related transfer functions and virtual auditory display. In: Soundscape Semiotics - Localization and Categorization. IntechOpen; 2014. p. 1. DOI: 10.5772/56907
Makous JC, Middlebrooks JC. Two-dimensional sound localization by human listeners. The Journal of the Acoustical Society of America. 1990; 87(5):2188-2200. DOI: 10.1121/1.399186
Middlebrooks JC. Spectral shape cues for sound localization. In: Binaural and Spatial Hearing in Real and Virtual Environments. New York: Psychology Press; 1997. pp. 77-97
Middlebrooks JC. Virtual localization improved by scaling nonindividualized external-ear transfer functions in frequency. The Journal of the Acoustical Society of America. 1999; 106(3):1493-1510. DOI: 10.1121/1.427147
Perrott DR, Saberi K. Minimum audible angle thresholds for sources varying in both elevation and azimuth. The Journal of the Acoustical Society of America. 1990; 87(4):1728-1731, ISSN: 0001-4966. DOI: 10.1121/1.399421
Middlebrooks JC, Green DM. Sound localization by human listeners. Annual Review of Psychology. 1991; 42(1):135-159. DOI: 10.1146/annurev.ps.42.020191.001031
Poirier P, Miljours S, Lassonde M, Lepore F. Sound localization in acallosal human listeners. Brain. 1993; 116(1):53-69. DOI: 10.1093/brain/116.1.53
Voss P, Lassonde M, Gougoux F, Fortin M, Guillemot J-P, Lepore F. Early- and late-onset blind individuals show supra-normal auditory abilities in far-space. Current Biology. 2004; 14(19):1734-1738. DOI: 10.1016/j.cub.2004.09.051
Senn P, Kompis M, Vischer M, Haeusler R. Minimum audible angle, just noticeable interaural differences and speech intelligibility with bilateral cochlear implants using clinical speech processors. Audiology and Neurotology. 2005; 10(6):342-352. DOI: 10.1159/000087351
Pulkki V. Localization of amplitude-panned virtual sources II: Two- and three-dimensional panning. Journal of the Audio Engineering Society. 2001; 49(4):753-767
Bremen P, van Wanrooij MM, van Opstal AJ. Pinna cues determine orienting response modes to synchronous sounds in elevation. Journal of Neuroscience. 2010; 30(1):194-204. DOI: 10.1523/JNEUROSCI.2982-09.2010
Brimijoin WO, Akeroyd MA. The moving minimum audible angle is smaller during self motion than during source motion. Frontiers in Neuroscience. 2014; 8:273. DOI: 10.3389/fnins.2014.00273
Begault DR, Wenzel EM, Anderson MR. Direct comparison of the impact of head tracking, reverberation, and individualized head-related transfer functions on the spatial perception of a virtual speech source. Journal of the Audio Engineering Society. 2001; 49(10):904-916
Stitt P, Hendrickx E, Messonnier J, Katz B. The role of head tracking in binaural rendering. In: 29th Tonmeistertagung, International VDT Convention. Germany: CCN Cologne; 2016
Urbanietz C, Enzner G. Binaural rendering of dynamic head and sound source orientation using high-resolution HRTF and retarded time. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Calgary, AB, Canada: IEEE; 2018. pp. 566-570. DOI: 10.1109/ICASSP.2018.8461343
Pörschmann C, Arend JM. Obtaining dense HRTF sets from sparse measurements in reverberant environments. In: Audio Engineering Society Conference: 2019 AES International Conference on Immersive and Interactive Audio. New York, New York, United States: Audio Engineering Society; 2019
Algazi VR, Duda RO, Thompson DM, Avendano C. The CIPIC HRTF database. In: Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (Cat. No.01TH8575). New Paltz, NY: IEEE; 2001. pp. 99-102. DOI: 10.1109/ASPAA.2001.969552
Pelzer R, Dinakaran M, Brinkmann F, Lepa S, Grosche P, Weinzierl S. Head-related transfer function recommendation based on perceptual similarities and anthropometric features. The Journal of the Acoustical Society of America. 2020; 148(6):3809-3817. DOI: 10.1121/10.0002884
Ziegelwanger H, Reichinger A, Majdak P. Calculation of listener-specific head-related transfer functions: Effect of mesh quality. In: Proceedings of Meetings on Acoustics. Vol. 19. Montreal, Canada. 2013. p. 050017. DOI: 10.1121/1.4799868
Gardner MB, Gardner RS. Problem of localization in the median plane: Effect of pinnae cavity occlusion. The Journal of the Acoustical Society of America. 1973; 53(2):400-408. DOI: 10.1121/1.1913336
Nelson PA, Kahana Y. Spherical harmonics, singular-value decomposition and head-related transfer function. Journal of Sound and Vibration. 2001; 239:607-637. DOI: 10.1006/jsvi.2000.3227
Shaw EAG. The external ear. In: Keidel WD, Neff WD, editors. Auditory System. Vol. 5/1. Berlin, Heidelberg: Springer; 1974. pp. 455-490, ISBN: 978-3-642-65829-7. DOI: 10.1007/978-3-642-65829-7_14
Brinkmann F. The FABIAN head-related transfer function data base. Berlin: Technische Universität Berlin; 2017. DOI: 10.14279/depositonce-5718
Brinkmann F, Dinakaran M, Pelzer R, Grosche P, Voss D, Weinzierl S. A cross-evaluated database of measured and simulated HRTFs including 3D head meshes, anthropometric features, and headphone impulse responses. Journal of the Audio Engineering Society. 2019; 67(9):705-718. DOI: 10.17743/jaes.2019.0024
Ghorbal S, Bonjour X, Séguier R. Computed HRIRs and ears database for acoustic research. In: Audio Engineering Society Convention 148. New York, New York, United States: Audio Engineering Society; 2020
Katz BF. Acoustic absorption measurement of human hair and skin within the audible frequency range. The Journal of the Acoustical Society of America. 2000; 108(5 Pt 1):2238-2242. DOI: 10.1121/1.1314319
Treeby BE, Pan J, Paurobally RM. An experimental study of the acoustic impedance characteristics of human hair. The Journal of the Acoustical Society of America. 2007; 122(4):2107-2117. DOI: 10.1121/1.2773946
Brinkmann F, Lindau A, Weinzierl S. On the authenticity of individual dynamic binaural synthesis. The Journal of the Acoustical Society of America. 2017; 142(4):1784-1795, ISSN: 0001-4966. DOI: 10.1121/1.5005606
Brinkmann F, Lindau A, Weinzierl S, Müller-Trapet M, Opdam R, Vorländer M, et al. A high resolution and full-spherical head-related transfer function database for different head-above-torso orientations. Journal of the Audio Engineering Society. 2017; 65(10):841-848. DOI: 10.17743/jaes.2017.0033
Ziegelwanger H, Majdak P, Kreuzer W. Numerical calculation of listener-specific head-related transfer functions and sound localization: Microphone model and mesh discretization. The Journal of the Acoustical Society of America. 2015; 138(1):208-222, ISSN: 0001-4966. DOI: 10.1121/1.4922518
Zotkin DN, Duraiswami R, Grassi E, Gumerov NA. Fast head-related transfer function measurement via reciprocity. The Journal of the Acoustical Society of America. 2006; 120(4):2202-2215. DOI: 10.1121/1.2207578
Carlile S, Leong P, Hyams S. The nature and distribution of errors in sound localization by human listeners. Hearing Research. 1997; 114(1–2):179-196. DOI: 10.1016/S0378-5955(97)00161-5
Masiero B, Pollow M, Fels J. Design of a fast broadband individual head-related transfer function measurement system. Acta Acustica united with Acustica. Stuttgart: S. Hirzel; 2011; 97:136
Bau D, Lübeck T, Arend JM, Dziwis D, Pörschmann C. Simplifying head-related transfer function measurements: A system for use in regular rooms based on free head movements. In: 8th International Conference on Immersive and 3D Audio. Bologna, Italy: I3DA; 2021
Reijniers J, Partoens B, Steckel J, Peremans H. HRTF measurement by means of unsupervised head movements with respect to a single fixed speaker. IEEE Access. 2020; 8:92287-92300, ISSN: 2169-3536. DOI: 10.1109/ACCESS.2020.2994932
Fukudome K, Suetsugu T, Ueshin T, Idegami R, Takeya K. The fast measurement of head related impulse responses for all azimuthal directions using the continuous measurement method with a servo-swiveled chair. Applied Acoustics. 2007; 68(8):864-884. DOI: 10.1016/j.apacoust.2006.09.009
He J, Ranjan R, Gan W-S, Chaudhary NK, Hai ND, Gupta R. Fast continuous measurement of HRTFs with unconstrained head movements for 3d audio. Journal of the Audio Engineering Society. 2018; 66(11):884-900. DOI: 10.17743/jaes.2018.0050
Richter J-G, Fels J. On the influence of continuous subject rotation during high-resolution head-related transfer function measurements. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2019; 27(4):730-741. DOI: 10.1109/TASLP.2019.2894329
Pulkki V, Laitinen M-V, Sivonen V. HRTF measurements with a continuously moving loudspeaker and swept sines. In: Audio Engineering Society Convention 128. New York, New York, United States: Audio Engineering Society; 2010
Kabzinski T, Jax P. Towards faster continuous multi-channel HRTF measurements based on learning system models. In: 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Singapore: IEEE; 2022. arXiv preprint arXiv:2110.03630
Majdak P, Balazs P, Laback B. Multiple exponential sweep method for fast measurement of head-related transfer functions. Journal of the Audio Engineering Society. 2007; 55:623-637
Middlebrooks JC, Makous JC, Green DM. Directional sensitivity of sound-pressure levels in the human ear canal. The Journal of the Acoustical Society of America. 1989; 86(1):89-108. DOI: 10.1121/1.398224
Wightman F, Kistler D, Foster S, Abel J. A comparison of head-related transfer functions measured deep in the ear canal and at the ear canal entrance. In: 17th Midwinter Meeting of the Association for Research in Otolaryngology. Vol. 71. Montreal: ARO; 1995
Zahorik P. Limitations in using golay codes for head-related transfer function measurement. The Journal of the Acoustical Society of America. 2000; 107(3):1793-1796. DOI: 10.1121/1.428579
Dietrich P, Masiero B, Vorländer M. On the optimization of the multiple exponential sweep method. Journal of the Audio Engineering Society. 2013; 61(3):113-124
Armstrong C, Thresh L, Murphy D, Kearney G. A perceptual evaluation of individual and non-individual HRTFs: A case study of the SADIE II database. Applied Sciences. 2018; 8(11):2029. DOI: 10.3390/app8112029
Denk F, Kollmeier B, Ewert SD. Removing reflections in semianechoic impulse responses by frequency-dependent truncation. Journal of the Audio Engineering Society. 2018; 66(3):146-153. DOI: 10.17743/jaes.2018.0002
Kistler DJ, Wightman FL. A model of head-related transfer functions based on principal components analysis and minimum-phase reconstruction. The Journal of the Acoustical Society of America. 1992; 91(3):1637-1647. DOI: 10.1121/1.402444
Kohlrausch A, Breebaart J. Perceptual (ir) relevance of HRTF magnitude and phase spectra. In: Audio Engineering Society Convention 110. New York, New York, United States: Audio Engineering Society; 2001
Bergman DR. Computational Acoustics: Theory and Implementation. Hoboken, New Jersey, United States: John Wiley & Sons; 2018
Marburg S. Six boundary elements per wavelength: Is that enough? Journal of Computational Acoustics. 2002; 10:25-51. DOI: 10.1142/S0218396X02001401
Botsch M, Kobbelt L. A remeshing approach to multiresolution modeling. In: Proceedings of the 2004 Eurographics/ACM SIGGRAPH Symposium on Geometry Processing. New York, NY, United States: Association for Computing Machinery; 2004. pp. 185-192. DOI: 10.1145/1057432.1057457
Reichinger A, Majdak P, Sablatnig R, Maierhofer S. Evaluation of methods for optical 3-D scanning of human pinnas. In: Proceedings of the 3D Vision Conference. Seattle, WA: IEEE; 2013. pp. 390-397. DOI: 10.1109/3DV.2013.58
Dinakaran M, Brinkmann F, Harder S, Pelzer R, Grosche P, Paulsen RR, et al. Perceptually motivated analysis of numerically simulated head-related transfer functions generated by various 3d surface scanning systems. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Calgary, Alberta, Canada: IEEE; 2018. pp. 551-555. DOI: 10.1109/ICASSP.2018.8461789
Greff R, Katz BF. Round robin comparison of HRTF simulation systems: Preliminary results. In: Audio Engineering Society Convention 123, Convention Paper 7188. New York, New York, United States: Audio Engineering Society; 2007
Dellepiane M, Pietroni N, Tsingos N, Asselot M, Scopigno R. Reconstructing head models from photographs for individualized 3d-audio processing. In: Computer Graphics Forum. Vol. 27. Hoboken, New Jersey, United States: Wiley Online Library; 2008. pp. 1719-1727. DOI: 10.1111/j.1467-8659.2008.01316.x
Iida K, Nishiyama O, Aizaki T. Estimation of the category of notch frequency bins of the individual head-related transfer functions using the anthropometry of the listener’s pinnae. Applied Acoustics. 2021; 177:107929. DOI: 10.1016/j.apacoust.2021.107929
Pollack K, Brinkmann F, Majdak P, Kreuzer W. Von Fotos zu personalisierter räumlicher Audiowiedergabe [from photos to personalised spatial audio playback]. e & i Elektrotechnik und Informationstechnik. 2021; 138(3):1-6. DOI: 10.1007/s00502-021-00891-4
Ullman S, Brenner S. The interpretation of structure from motion. Proceedings of the Royal Society of London. Series B. Biological Sciences. 1979; 203(1153):405-426. DOI: 10.1098/rspb.1979.0006
Sommerfeld A. Partial Differential Equations in Physics. Cambridge, Massachusetts, United States: Academic Press; 1949
Turner MJ, Clough RW, Martin HC, Topp L. Stiffness and deflection analysis of complex structures. Journal of the Aeronautical Sciences. 1956; 23(9):805-823. DOI: 10.2514/8.3664
Bériot H, Prinn A, Gabard G. Efficient implementation of high-order finite elements for Helmholtz problems. International Journal for Numerical Methods in Engineering. 2016; 106(3):213-240. DOI: 10.1002/nme.5172
Gabard G, Bériot H, Prinn A, Kucukcoskun K. Adaptive, high-order finite-element method for convected acoustics. AIAA Journal. 2018; 56(8):3179-3191. DOI: 10.2514/1.J057054
Ueberhuber CW. Numerical Computation 1: Methods, Software, and Analysis. Vol. 16. Berlin, Germany: Springer Science & Business Media; 1997
Beriot H, Modave A. An automatic perfectly matched layer for acoustic finite element simulations in convex domains of general shape. International Journal for Numerical Methods in Engineering. 2021; 122(5):1239-1261. DOI: 10.1002/nme.6560
Farahikia M, Su QT. Optimized finite element method for acoustic scattering analysis with application to head-related transfer function estimation. Journal of Vibration and Acoustics. 2017; 139(3):034501. DOI: 10.1115/1.4035813
Harder S, Paulsen RR, Larsen M, Laugesen S, Mihocic M, Majdak P. A framework for geometry acquisition, 3-D printing, simulation, and measurement of head-related transfer functions with a focus on hearing-assistive devices. Computer-Aided Design. 2016; 75-76:39-46, ISSN: 0010-4485. DOI: 10.1016/j.cad.2016.02.006
Huttunen T, Seppälä ET, Kirkeby O, Kärkkäinen A, Kärkkäinen L. Simulation of the transfer function for a head-and-torso model over the entire audible frequency range. Journal of Computational Acoustics. 2007; 15(04):429-448. DOI: 10.1142/S0218396X07003469
Kahana Y. Numerical Modelling of the Head-Related Transfer Function. Dissertation. Southampton, UK: University of Southampton; 2000
Ma F, Wu JH, Huang M, Zhang W, Hou W, Bai C. Finite element determination of the head-related transfer function. Journal of Mechanics in Medicine and Biology. 2015; 15(05):1550066. DOI: 10.1142/S0219519415500669
Yee K. Numerical solution of initial boundary value problems involving Maxwell’s equations in isotropic media. IEEE Transactions on Antennas and Propagation. 1966; 14(3):302-307. DOI: 10.1109/TAP.1966.1138693
Botts J, Savioja L. Spectral and pseudospectral properties of finite difference models used in audio and room acoustics. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2014; 22(9):1403-1412. DOI: 10.1109/TASLP.2014.2332045
Häggblad J, Runborg O. Accuracy of staircase approximations in finite-difference methods for wave propagation. Numerische Mathematik. 2014; 128(4):741-771. DOI: 10.1007/s00211-014-0625-1
Prepeliţă ST, Geronazzo M, Avanzini F, Savioja L. Influence of voxelization on finite difference time domain simulations of head-related transfer functions. The Journal of the Acoustical Society of America. 2016; 139(5):2489-2504. DOI: 10.1121/1.4947546
Prepeliţă ST, Gómez Bolaños J, Geronazzo M, Mehra R, Savioja L. Pinna-related transfer functions and lossless wave equation using finite-difference methods: Verification and asymptotic solution. The Journal of the Acoustical Society of America. 2019; 146(5):3629-3645. DOI: 10.1121/1.5131245
Prepeliţă ST, Gómez Bolaños J, Geronazzo M, Mehra R, Savioja L. Pinna-related transfer functions and lossless wave equation using finite-difference methods: Validation with measurements. The Journal of the Acoustical Society of America. 2020; 147(5):3631-3645. DOI: 10.1121/10.0001230
Botteldooren D. Acoustical finite-difference time-domain simulation in a quasi-cartesian grid. The Journal of the Acoustical Society of America. 1994; 95(5):2313-2319. DOI: 10.1121/1.409866
Willemsen S, Bilbao S, Ducceschi M, Serafin S. Dynamic grids for finite-difference schemes in musical instrument simulations. In: 24th International Conference on Digital Audio Effects. Vienna, Austria: DAFX; 2021. pp. 144-151
Bilbao S. Modeling of complex geometries and boundary conditions in finite difference/finite volume time domain room acoustics simulation. IEEE Transactions on Audio, Speech, and Language Processing. 2013; 21(7):1524-1533. DOI: 10.1109/TASL.2013.2256897
Bilbao S, Hamilton B. Passive volumetric time domain simulation for room acoustics applications. The Journal of the Acoustical Society of America. 2019; 145(4):2613-2624. DOI: 10.1121/1.5095876
Bilbao S, Hamilton B, Botts J, Savioja L. Finite volume time domain room acoustics simulation under general impedance boundary conditions. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2015; 24(1):161-173. DOI: 10.1109/TASLP.2015.2500018
Peiró J, Sherwin S. Finite difference, finite element and finite volume methods for partial differential equations. In: Handbook of Materials Modeling. Berlin, Germany: Springer; 2005. pp. 2415-2446. DOI: 10.1007/978-1-4020-3286-8_127
Mokhtari P, Takemoto H, Nishimura R, Kato H. Frequency and amplitude estimation of the first peak of head-related transfer functions from individual pinna anthropometry. The Journal of the Acoustical Society of America. 2015; 137(2):690-701. DOI: 10.1121/1.4906160
Xiao T, Liu QH. Finite difference computation of head-related transfer function for human hearing. The Journal of the Acoustical Society of America. 2003; 113(5):2434-2441, ISSN: 0001-4966. DOI: 10.1121/1.1561495
Gumerov NA, O’Donovan AE, Duraiswami R, Zotkin DN. Computation of the head-related transfer function via the fast multipole accelerated boundary element method and its spherical harmonic representation. The Journal of the Acoustical Society of America. 2010; 127(1):370-386. DOI: 10.1121/1.3257598
Galerkin BG. Rods and plates. Series occurring in various questions concerning the elastic equilibrium of rods and plates. Engineers Bulletin (Vestnik Inzhenerov). 1915; 19:897-908
Nyström EJ. Über die praktische Auflösung von Integralgleichungen mit Anwendungen auf Randwertaufgaben [about the practical solution of integral equations with applications to boundary value problems]. Acta Mathematica. 1930; 54:185-204. DOI: 10.1007/BF02547521
Sauter SA, Schwab C. Boundary Element Methods. Berlin, Germany: Springer; 2011
Arnold DN, Wendland WL. Collocation versus Galerkin procedures for boundary integral methods. In: Brebbia CA, editor. Boundary Element Methods in Engineering. Berlin, Heidelberg: Springer; 1982, ISBN: 978-3-662-11275-5. DOI: 10.1007/978-3-662-11273-1_2
Duffy MG. Quadrature over a pyramid or cube of integrands with a singularity at a vertex. SIAM Journal on Numerical Analysis. 1982; 19(6):1260-1262. DOI: 10.1137/0719090
Krishnasamy G, Schmerr L, Rudolphi T, Rizzo F. Hypersingular boundary integral equations: Some applications in acoustic and elastic wave scattering. Transactions of the ASME. 1990; 57:404-414. DOI: 10.1115/1.2892004
Coifman R, Rokhlin V, Wandzura S. The fast multipole method for the wave equation: A pedestrian prescription. IEEE Antennas and Propagation Magazine. 1993; 35(3):7-12, ISSN: 1045-9243. DOI: 10.1109/74.250128
Hackbusch W. Hierarchical Matrices: Algorithms and Analysis. Berlin, Heidelberg: Springer; 2015. DOI: 10.1007/978-3-662-47324-5
Kreuzer W, Majdak P, Chen Z. Fast multipole boundary element method to calculate head-related transfer functions for a wide frequency range. The Journal of the Acoustical Society of America. 2009; 126(3):1280-1290. DOI: 10.1121/1.3177264
Saad Y. Iterative Methods for Sparse Linear Systems. Philadelphia, PA, United States: SIAM; 2003
Burton AJ, Miller GF. The application of integral equation methods to the numerical solution of some exterior boundary-value problems. Proceedings of the Royal Society of London A. Mathematical and Physical Sciences. 1971; 323(1553):201-210, ISSN: 0080-4630. DOI: 10.1098/rspa.1971.0097
Katz BF. Boundary element method calculation of individual head-related transfer function. I. Rigid model calculation. The Journal of the Acoustical Society of America. 2001; 110(5 Pt 1):2440-2448. DOI: 10.1121/1.1412440
Katz BF. Boundary element method calculation of individual head-related transfer function. II. Impedance effects and comparisons to real measurements. The Journal of the Acoustical Society of America. 2001; 110(5 Pt 1):2449-2455. DOI: 10.1121/1.1412441
Otani M, Ise S. A fast calculation method of the head-related transfer functions for multiple source points based on the boundary element method. Acoustical Science and Technology. 2003; 24(5):259-266. DOI: 10.1250/ast.24.259
Otani M, Ise S. Fast calculation system specialized for head-related transfer function based on boundary element method. The Journal of the Acoustical Society of America. 2006; 119(5 Pt 1):2589-2598, ISSN: 0001-4966. DOI: 10.1121/1.2191608
Ziegelwanger H, Kreuzer W, Majdak P. Mesh2HRTF: Open-source software package for the numerical calculation of head-related transfer functions. In: Proceedings of the 22nd International Congress on Sound and Vibration (ICSV22). Florence, Italy; 2015. pp. 1-8. DOI: 10.13140/RG.2.1.1707.1128
Fink KJ, Ray L. Individualization of head related transfer functions using principal component analysis. Applied Acoustics. 2015; 87:162-173. DOI: 10.1016/j.apacoust.2014.07.005
Xie B, Zhong X, Rao D, Liang Z. Head-related transfer function database and its analyses. Science in China Series G: Physics, Mechanics and Astronomy. 2007; 50(3):267-280, ISSN: 1672-1799, 1862-2844. DOI: 10.1007/s11433-007-0018-x
Nishino T, Inoue N, Takeda K, Itakura F. Estimation of HRTFs on the horizontal plane using physical features. Applied Acoustics. 2007; 68(8):897-908, ISSN: 0003-682X. DOI: 10/dr4tg3
Xie B. Head-Related Transfer Function and Virtual Auditory Display. Plantation, FL, United States: J. Ross Publishing; 2013
Gromov M. Metric structures for Riemannian and non-Riemannian spaces. Bulletin of the American Mathematical Society. 2001; 38:353-363
Hebrank J, Wright D. Are two ears necessary for localization of sound sources on the median plane? The Journal of the Acoustical Society of America. 1974; 56(3):935-938. DOI: 10.1121/1.1903351