Frequency calculation for pick plucked guitar notes by octave method.
Music is the pulse of human lives and is an amazing tool to relieve and re-live. In signal processing, the impulse is the pulse of the researcher. The work presented here focuses on impulse response modeling of notes produced by a box-shaped acoustic guitar. The impulse response is a fundamental behavior of any system. A music note is the convolution of the impulse response of the guitar and its excitation signal. The frequency of the generated music note follows the octave rule, and the octave rule can be checked for impulse responses as well. If the excitation signal and the impulse response are separated, the impulse response of a single fret can be used to generate the impulse responses of the other frets. Here the music notes are analyzed and synthesized on the basis of the plucking style and plucking expression of the guitar player. If the impulse response of the musical instrument is known, the output music note can be synthesized in an unusual manner. Researchers have been able to estimate the impulse response by breaking the string of the guitar. Estimating the impulse response from recorded music notes is possible using a cepstral domain window. Using the Adaptive Cepstral Domain Window (ACDW), the author estimated the impulse response of guitar notes. The work has been further extended to the classification of synthesized notes by plucking style and plucking expression using neural network and machine learning algorithms.
- impulse response
- acoustic guitar
- octave rule
- adaptive cepstral domain window
The work starts with an analysis of different approaches to the mathematical modeling of musical instruments. The aim is to bridge the gap between the synthesized music note and the original note. Efforts have been made to propose a model that helps preserve some of the oldest, nearly obsolete musical instruments. Past researchers developed various instrument models. One of them is the attack, decay, sustain and release (ADSR) parameter-based model. Figure 1 shows the ADSR graph. These parameters measure the time required for complete generation of the music note. The ADSR envelope is also referred to as the ‘timbre’ of the music note and is helpful for differentiating instrument families. A music note is perceived through its timbre, i.e. its envelope, and its fundamental frequency, i.e. its ‘pitch’, along with its harmonics. The fullness of a music note is perceived through these harmonic frequencies. The Fast Fourier Transform (FFT) gives the frequency domain representation of any signal (here, the music notes). In the early days of modeling, pitch and timbre laid the foundation for current developments in instrument modeling. ADSR synthesis is limited in its ability to produce mathematically the number of harmonics required to reach the richness or fullness of the original music note.
The other method, the digital waveguide (DWG), is also used for modeling musical instruments. Figure 2 shows the schematic diagram of the DWG technique. It represents the musical instrument as a wave guided between two fixed points, putting forward the concept of a music note as a wave traveling between them. The bridge and the nut are the endpoints (rigid terminations) between which the wave travels. DWG string modeling involves non-linear distortions, and its post-distortion gain needs to be adjusted to develop the model. DWG is challenging because of its higher computational burden.
Researchers have also experimented with the impulse response of the musical instrument, and there has been competitive research on finding it. Some researchers used the hammer method for estimating the impulse response. Other experiments, as mentioned earlier, involved breaking the string of the guitar. Keeping the limitations of these methods in mind, this research started with the cepstral domain approach, a well-known technique in speech processing.
Every instrument is unique, and this uniqueness can be demonstrated with its impulse response, which is as unique as a person’s fingerprint. Earlier experiments involved breaking the string of the guitar to record the impulse response. This motivated the author to look for a better way to capture this signature of the instrument, i.e. the impulse response, using the cepstral domain representation. Being a guitar lover, the author chose the notes of a box-shaped acoustic guitar for the research work.
A simple convolution operation is involved in the generation of a music note. The excitation signal, x(t), is convolved with the impulse response of the instrument to generate the music note, y(t). Figure 3 shows the block diagram for this signal processing operation. When the music note is recorded and x(t) is known, the impulse response can be estimated and recorded. This research demonstrates the extraction of the impulse response from the music note using the adaptive cepstral domain window (ACDW) method. Figure 4 shows the outline of the cepstral domain window approach. Impulse response based synthesis is carried out, and listening tests are conducted with guitar players to measure the mean opinion score (MoS) of the synthesized notes. Further, machine learning algorithms are used to classify the synthesized notes by playing expression.
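The convolution relationship between excitation, impulse response and note can be illustrated with a minimal sketch. It assumes NumPy is available; the impulse-train excitation, the decaying-cosine body response and the function name `synthesize_note` are hypothetical stand-ins chosen for illustration, not the signals or code used in this work.

```python
import numpy as np

def synthesize_note(excitation, impulse_response):
    """Return y = x * h, the linear convolution of excitation and impulse response."""
    return np.convolve(excitation, impulse_response, mode="full")

fs = 16000                      # sampling rate used for the recordings in this chapter
period = round(fs / 329.6)      # samples per cycle near the open string 1 pitch

# Hypothetical excitation x: a periodic impulse train (0.1 s long).
x = np.zeros(fs // 10)
x[::period] = 1.0

# Hypothetical body response h: a short decaying oscillation.
n = np.arange(200)
h = np.exp(-n / 40.0) * np.cos(2 * np.pi * 100 * n / fs)

y = synthesize_note(x, h)       # synthesized note, length len(x) + len(h) - 1
```

Before the second excitation pulse arrives, the output is simply a copy of the impulse response, which is why a clean estimate of h carries so much of the character of the instrument.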
The outline of the chapter is as follows: Section 2 discusses the work of other researchers, Section 3 discusses the methodology for impulse response modeling, and Section 4 discusses the results in detail. Section 5 summarizes the work and concludes the modeling work for the acoustic guitar.
2. Literature review
This section reviews the modeling techniques of guitar as an instrument. The physical model of an acoustic guitar consists of three main parts: strings of guitar, the wooden sound box and the sound radiated by the soundboard. The review starts with the modeling techniques used for guitar strings, continues with the modeling techniques involving the sound box and finally the convolved signal i.e. radiated sound by the soundboard. It also covers the Neural Network (NN) and Deep Learning based classification techniques and verification of synthesized music instruments. The literature survey has been divided into: Physical or mathematical modeling and Impulse response methods.
The work by Gerald Schuller et al. has been used as the reference for the collection of acoustic guitar notes. For feature extraction of plucking and expression styles of the electric bass guitar, they considered five plucking styles, finger-style (FS), picked (PK), muted (MU), slap-thumb (ST) and slap-pluck (SP), and five expression styles: normal (NO), vibrato (VI), bending (BE), harmonics (HA) and dead-note (DN). Anssi Klapuri et al. proposed a method for automatically extracting the fingering configurations from a recorded guitar performance. They considered 330 different fingering configurations, corresponding to different versions of the major, minor, major 7th and minor 7th chords, and used a Hidden Markov Model.
Migneco et al.  proposed physical models for plucked string instruments that can produce high-quality tones using a computationally efficient implementation, but the estimation of model parameters through the analysis of audio remains challenging. Moreover, an accurate representation of the expressive aspects of a performance requires a separation of the performer’s articulation (source) from the instrument’s response (filter). This work explores a physically-inspired signal model for plucked guitar sounds. It facilitates the estimation of both string excitation and resonance parameters simultaneously. Julius O Smith  discussed the piano synthesis, focused on commuted synthesis. The instrument models can be treated as Linear Time Invariant systems and that’s why the commutation is possible. Commuted synthesis promotes implementation of enormous resonators inexpensively, three orders of magnitude less computation for other string instruments. The sound board and enclosure (i.e. guitar body) are commuted. It needs stored recording of their impulse responses. Otherwise it demands higher order digital filters.
Further, the work by Meng Koon Lee et al. in  talks about the physical modeling based on the interaction of the strings of the guitar with other parts of the guitar body. The researchers experimented with the sound generated by guitar with respect to soundboard and its relationship with the guitar body. The soundboard plays an increasingly important role compared to the sound hole, back plate, and the bridge at high frequencies. Design of bracings and their placements on the soundboard increase its structural stiffness as well as redistributing its deflection to non-braced regions and affecting its loudness as well as its response at low and high frequencies. The work is focused to increase the sound level with bracing designs and their placements. The analysis is being carried out for the archtop guitar.
The paper , written by Keith D Martin explains the classification technique based on physical properties. It is focused on the classification using pattern recognition. A statistical pattern-recognition technique is applied to instrument tones within a taxonomic hierarchy. The salient acoustic features related to physical properties of source excitation and resonance structure are measured from output of auditory model for 1,023 isolated tones over the full pitch ranges of 15 orchestral instruments. The data set included examples from the string (bowed and plucked), woodwind (single, double, and air reed), and brass families. Eric J. Henry et al.  proposed a model that can yield representations for the chords that require minimal prior knowledge to interpret. The model has been developed to address both challenges by modeling the physical constraints of a guitar to produce human-readable representations of music audio, i.e. guitar tablature via a deep convolutional network.
Jakob Abeßer et al.  worked on a feature-based approach for the classification of different playing techniques in bass guitar recordings. The applied audio features are chosen to capture typical instrument sounds induced by 10 different playing techniques. This work introduced a set of low-level features that allowed modeling the peculiarities of 10 different bass-related plucking and expression styles by capturing typical timbre related characteristics. The work further in  models the plectrum which is used for playing guitar notes. Here Francois Germain et al. proposed a model of the plectrum, a guitar pick, for use in physically inspired sound synthesis. The model is drawn from the mechanics of beams. The profile of the plectrum is computed in real time based on its interaction with the string, which depends on the movement impressed by the player and the equilibrium of dynamical forces. A condition for the release of the string is derived, which allows driving the digital waveguide simulating the string to the proper state at release time. The algorithm proposed by Henri Penttinen et al.  estimates the plucking point of guitar tones obtained with an under-saddle pickup. This problem is approached in the time domain by applying autocorrelation estimation. Onset detection has been improved in this proposed work. It enables a new way to control audio effect parameters in real time by simply changing the plucking point. The plucking position changes the timbre of the string’s tone, most notably the brightness. This effect is used as an expressive tool in music. By using the PPE (Plucking Point Estimation) algorithm to control an audio effect, change in the plucking position can affect the timbre even more dramatically than in the natural unprocessed case.
Gabriele Varieschi et al. presented mathematical and physical models for analyzing the intonation problem of musical instruments such as guitars, mandolins and similar instruments. The analysis is done by designing the fret placement on the fingerboard according to mathematical rules, assuming an ideal string. The intonation of a string note is affected when string deformation and inharmonicity come into the picture. To nullify these effects, the authors designed compensation procedures. V. E. Howle et al. applied the well-known tool of eigenvalues to musical instruments. The work, as its name “Eigenvalues and musical instruments” suggests, is based on finding the eigenvalues of musical instruments. Instrument categories such as strings, bars and drums fall under the class of linear systems. The work focuses on plotting the eigenvalues for different types of musical instruments, giving a pictorial view of how the eigenvalues change with parameters such as stiffness, friction or sound radiation.
Antoine Chaigne et al.  considered the end conditions of piano strings and proposed that it can be approximated by the input admittance at the bridge. A method of validation of admittance measurements on simple structures is proposed in this paper. High resolution signal analysis performed on string’s vibrations yields an estimate for the input admittance. This method is implemented on a simplified device composed of a piano string coupled to a thin steel beam.
A parametric modeling of string instruments is proposed by Matti Karjalainen et al. Parametric modeling of musical instrument sounds helps to re-synthesize or morph the sounds. This type of modeling can also be used to apply the parameters in physical and perceptual studies of acoustic instruments. It is typically based on a pole-zero modeling technique applied to string instrument sounds. As proposed by Julius Smith, the instruments are assumed to be linear time-invariant systems in this parametric modeling. Our research work has been guided by the same principle as the work of Julius Smith.
The authors in work [15, 16] talk about sound separation. It is very important for developing the equalizers to balance the sounds of different musical instruments in music events. The method based on ‘anisotropic smoothness’ indicates that the harmonic instruments are smooth in the time domain whereas the percussive instruments are smooth in the frequency domain. The authors have worked on both the types of music notes. The spectrogram highlights this smoothness in time and the frequency domain and the method is implemented under some conditions. The work reduced the computational complexity for source separation as compared with the other methods as Monte-Carlo method and large-sized matrix multiplications. The results are discussed for the acoustic guitar and piano as the harmonic instruments and the drum as the percussive instruments. This paper again helped us to understand the spectrogram approach towards the source separation.
Further with reference to impulse response research, Nelson Lee et al.  proposed a method of decomposing a plucked string instrument into modular components. The model is based on parameter estimation of excitation signal, string vibration, body resonator and finally the radiated sound pressure. As the modeling progresses it becomes clear that for reaching close to body impulse responses, the order of the filter demands a hundred of poles and zeros. Inverse filtering is used to compensate for high orders of filters.
Friedrich Türckheim et al.  used the ‘Novel Approach of Impulse Response measurement’ as starting point for modeling approaches or to investigate the relationship between transfer functions and the instruments’ quality. This is done usually for the experimental determination of transfer function as the complete and reliable physical models are still to be developed. The impulse responses here have been compared with the commonly used impact hammer method. The work proposed in above two references have limitations of filter orders and determination of exact transfer functions. So the author thought of different approach for the calculation of impulse response of the guitar body.
The methodology adopted by the author differs from that of other researchers. This section summarizes the research work carried out for impulse response modeling of the acoustic guitar. Sections 3.1 and 3.2 discuss the structure of the acoustic guitar body and the collection of music notes for two different plucking styles and plucking expressions. Section 3.3 describes how the
3.1 Structure of acoustic guitar
Let us understand the structure of the box shaped acoustic guitar. It has a resonance cavity, shaped like a butterfly. The wings are short at nut side and are bigger at the saddle side. As shown in Figure 5, there are six strings on the acoustic guitar. The strings are tied between the saddle and the nut of guitar on the fingerboard. The fingerboard is also known as fretboard.
The fingerboard of the guitar has 19 to 21 frets. The frets are metal strips on the fretboard, spaced on a logarithmic scale. They are denoted x1, x2, x3, …x19 here. The acoustic guitar model FAW 802 is chosen for the research work, and the music notes were recorded in an acoustic studio. This guitar has twenty-one frets and six strings. Music notes on twenty frets are considered for analysis and synthesis.
3.2 Collection of guitar notes and database generation
A music note is recorded for each fret of each string (except the 21st fret). The guitar notes are collected according to plucking style and plucking expression. The plucking style indicates the object used for plucking the string of the acoustic guitar. The plucking expression indicates whether the note is played by a naïve (beginner) player or an expert player. Two players, one naïve and one expert, are recorded for two plucking styles. Figure 6 gives an overview of the collection strategy for the guitar notes.
When the string is plucked by finger, it will generate a music note and it is named as ‘
Figure 7 shows the frequency values for all strings and their frets. The frets are labeled 1F to 20F. The first column gives the string numbers along with their names: string E, string B, string G, string D, string A, string E. The column ‘OPEN’ gives the open string frequencies, with 82 Hz as the lowest and 329 Hz as the highest open string frequency. The frequency of the fretted notes varies from 87 Hz for string 6 fret 1 to 1047 Hz for string 1 fret 20, which is the maximum fundamental frequency of the guitar notes. All guitar notes are therefore recorded at a 16 kHz sampling frequency in ‘.wav’ format. The sound analysis software ‘Audacity’ is used for noise removal from the guitar notes.
Figure 8 shows the scatter plot of the frequencies of all frets of all strings. The numbers 1 to 21 on the x-axis represent the fret numbers of the strings. The y-axis depicts the frequencies of guitar notes. This scatter plot helps to understand the mathematical relationship as well as the minimum and the maximum frequency values for these notes. The maximum frequency for the notes is 2 kHz so the sampling frequency of 16 kHz is selected for recording of the music notes in acoustic studio.
String E with a frequency of 329 Hz is string 1, string B with 247 Hz is string 2, string G with 196 Hz is string 3, string D with 147 Hz is string 4, string A with 110 Hz is string 5 and string E with 82 Hz is string 6. The strings are referred to by these numbers (string 1, string 2, …) in the rest of the discussion. In total, 504 notes are recorded, covering two plucking styles and two plucking expressions. Another set of 504 notes is recorded for string modeling. The generated dataset has been published on the Mendeley repository (Elsevier).
3.3 Pythagoras fractions OR rule of 18 (
Pythagoras fractions or Pythagorean tuning system is developed to study frequency ratios of all intervals which are based on the ratio of 3:2. It is the rule, given by Eq. (1), which indicates the mathematical relationship of all the music notes and is used for checking the frequency of generated notes. Here the ‘
This will create a table for 20 frets of each string. Table 1 shows the sample calculations of 6 frets for all the six strings from the open string frequencies. In this table, after knowing the open string frequency, all the other frequencies are calculated by using the second column which gives the octave multiplier factor. First column gives the fret number, ‘
Consider the sample values in cells which are highlighted in orange. The calculation for the frequency for string 1 fret 1 is done as:
e.g. 329.6 Hz (open string 1, rounded to 329 Hz in the text) × 1.059463 ≈ 349.199 Hz
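The per-fret calculation can be sketched in a few lines. This is a minimal illustration that assumes the multiplier 1.059463 is the equal-temperament semitone ratio 2^(1/12), applied once per fret; the open string values are standard-tuning frequencies rounded to one decimal, and the names `fret_frequency` and `fret_table` are hypothetical.

```python
SEMITONE = 2 ** (1 / 12)  # about 1.059463, the per-fret multiplier

# Assumed open string frequencies in Hz (standard tuning, one decimal place).
OPEN_STRING_HZ = {1: 329.6, 2: 246.9, 3: 196.0, 4: 146.8, 5: 110.0, 6: 82.4}

def fret_frequency(open_hz, fret):
    """Frequency of a note `fret` semitones above the open string."""
    return open_hz * SEMITONE ** fret

def fret_table(num_frets=20):
    """Map each string number to its frequencies for frets 0 (open) to num_frets."""
    return {
        string: [fret_frequency(open_hz, fret) for fret in range(num_frets + 1)]
        for string, open_hz in OPEN_STRING_HZ.items()
    }

table = fret_table()
print(f"string 1, fret 1:  {table[1][1]:.3f} Hz")
print(f"string 1, fret 12: {table[1][12]:.3f} Hz")
```

Fret 12 of any string comes out at twice the open string frequency, which is the octave relationship the chapter relies on.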
3.4 Impulse response modeling
The frequency analysis in the above sections helps to develop a better understanding of the fullness of the music notes based on their number of harmonics. It also helps in understanding the playing style and plucking expressions. The frequency analysis is now followed by impulse response estimation. The next five subsections deal with the synthesis part of the research work and discuss the algorithmic approach to impulse response modeling.
The cepstral domain approach is frequently used in speech signal processing, where it separates the vocal tract response from the excitation signal, but it has so far not been used for music signal processing. Based on the same principle, this modeling work focuses on separating 1) the impulse response, or body response, from 2) the excitation signal of the acoustic guitar notes.
The next section discusses the algorithm for Cepstral Domain Windowing (CDW) and its application for modeling of acoustic guitar notes using the same CDW method.
3.4.2 Cepstral domain window method
Let us focus on the theoretical aspects of the cepstral domain. Figure 9 shows the block schematic of the cepstral domain method used for speech analysis. The input to the system is the speech signal, which consists of the excitation signal convolved with the impulse response of the vocal tract. On the same principle, the music note is given as input to the system, which yields a representation in which the impulse response and the excitation signal are characteristically separated. The excitation signal is periodic in nature and the impulse response is a slowly varying function. The signal is passed through a smooth window function, here a Hamming window, and the spectrum is obtained by calculating the FFT of the block. When the logarithm of the magnitude of the FFT output is calculated, the periodic excitation signal appears as a rapidly varying function and the vocal tract response appears as a slowly varying function. By using the cepstral domain window, the excitation signal can be isolated from the body response. Thus the cepstral domain method can be used for modeling the music notes of a guitar.
3.4.3 Cepstral domain windowing method for acoustic guitar notes
The block schematic of the cepstral domain windowing method for the analysis of acoustic guitar notes is given in Figure 10. The CDW algorithm discussed in Section 3.4.2 is used to calculate the body response, or impulse response, of the guitar box. The input to the system is the acoustic guitar note. The FFT block gives the spectrum of the music signal, and then the logarithm of the magnitude of the FFT output is taken. The periodic excitation appears as a rapidly varying function, while the guitar body response, a slowly varying function, appears as the envelope of the spectrum. Taking the IFFT of this log spectrum moves the signal into the cepstral domain. The cepstral domain plot shows a cluster near the origin that represents the body response, and the periodic peaks after the cluster represent the periodic excitation generated by plucking the string by hand or with a plectrum.
The string 1 fret 2 note is taken as the sample input to the different blocks in Figure 10. Figure 11 shows the time domain representation of this note, played with the finger plucking style, which is used as input to the system for isolating the body response and the excitation signal. The sampling rate of the recorded note is 16 kHz.
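The chain of blocks described above (Hamming window, FFT, log magnitude, IFFT, cepstral window) can be sketched as follows. This is a minimal illustration assuming NumPy, not the author's implementation: it computes the real cepstrum, keeps the cluster near the origin as the body part and treats the remainder as the excitation part. The test frame, the window length of 40 samples and all function names are hypothetical, and the step from the windowed cepstrum back to a time-domain impulse response is deliberately left out because the text does not specify it.

```python
import numpy as np

def real_cepstrum(frame):
    """Hamming window -> FFT -> log magnitude -> IFFT."""
    windowed = frame * np.hamming(len(frame))
    spectrum = np.fft.fft(windowed)
    log_mag = np.log(np.abs(spectrum) + 1e-12)  # small offset avoids log(0)
    return np.fft.ifft(log_mag).real

def split_cepstrum(cepstrum, window_len):
    """Split a real cepstrum into a low-quefrency part (slowly varying body
    response) and the remainder (rapidly varying periodic excitation)."""
    n = len(cepstrum)
    lifter = np.zeros(n)
    lifter[:window_len] = 1.0
    lifter[n - window_len + 1:] = 1.0  # mirror image, since the real cepstrum is even
    body = cepstrum * lifter
    return body, cepstrum - body

def log_spectrum_envelope(body_cepstrum):
    """Smoothed log-magnitude spectrum implied by the body part alone."""
    return np.fft.fft(body_cepstrum).real

# Hypothetical test frame: an impulse train shaped by a short decaying response.
frame = np.zeros(1024)
frame[::48] = 1.0
frame = np.convolve(frame, np.exp(-np.arange(64) / 10.0))[:1024]

cep = real_cepstrum(frame)
body_cep, excitation_cep = split_cepstrum(cep, window_len=40)
envelope = log_spectrum_envelope(body_cep)
```

Because the two parts are obtained by masking, they always add back up to the full cepstrum; the only design decision is where the window ends, which is what the adaptive variant of the method tunes.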
3.4.4 Synthesis of guitar note using isolated body response and the excitation signal
The body responses and the excitation signals are calculated for 252 guitar notes, covering both finger plucked and plectrum plucked notes. A note is then synthesized by convolving the estimated body response with the isolated excitation signal. The results are verified using correlation coefficients, and it is observed that a constant-length window limits how well the synthesized guitar note correlates with the original. Figure 12 shows the plots of the original and the synthesized guitar note. The results are not satisfactory because of the low correlation coefficient values. Table 2 presents the sample values for 6 frets of string 1.
|String 1||Correlation coefficient||MoS Score based on (0–1) scale|
The synthesis results are improved by changing the length of the window. This ‘
This section discusses the ACDW method along with the results of impulse response estimation and the synthesis of guitar notes. The length of the window in the cepstral domain is varied in the range of 50 to 300 samples. The best impulse response is selected on the basis of the correlation coefficient, a statistical parameter that indicates the degree of similarity. Extensive experimentation is done by varying the number of samples in the cepstral domain window. It is observed that the correlation coefficient drops when the number of samples in the cepstral domain window is increased further. The range is finalized after studying the impulse response and the synthesis results.
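The window-length selection described above amounts to a small search loop, sketched here under stated assumptions. NumPy is assumed; `synthesize` is a hypothetical callable that takes a recorded note and a window length and returns the note resynthesized with that window, and the step of 10 samples between candidate lengths is an assumption (only the 50 to 300 range comes from the text).

```python
import numpy as np

def correlation_coefficient(original, synthesized):
    """Pearson correlation between two signals, compared over their common length."""
    n = min(len(original), len(synthesized))
    return float(np.corrcoef(original[:n], synthesized[:n])[0, 1])

def best_window_length(note, synthesize, lengths=range(50, 301, 10)):
    """Try each cepstral window length and keep the one whose synthesized note
    correlates best with the original recording."""
    scores = {
        length: correlation_coefficient(note, synthesize(note, length))
        for length in lengths
    }
    best = max(scores, key=scores.get)
    return best, scores
```

A call such as `best, scores = best_window_length(note, my_synthesizer)` returns both the chosen length and the full score curve, so the curve can be plotted against window length in the way the chapter describes.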
Figure 13 plots the correlation coefficients versus the number of samples in the cepstral domain for the string 1 fret 1 finger plucked Guitar note. From the figure, it’s clear that when number of samples in the chosen length of cepstral window is 70, ACDW synthesis gives highest correlation coefficient. The graph shows the decaying nature of the impulse response for the selected guitar note. So the improvement in the synthesis is achieved with the help of ACDW method. Once this is achieved, the extracted impulse responses are analyzed further to observe their relationship. The isolated impulse responses for all the frets of a single string are plotted and it’s observed that these impulses are also following the important
This triggered the thought of using the impulse response of a single fret to generate the impulse responses of the other frets. The experimentation is carried out for string 1 with all its 20 frets. The generated impulse responses were convolved with the separated excitation signals to generate all the music notes along a single string. This gives rise to a generalized acoustic guitar model in which a single impulse response can be stored and used to generate all the other music notes of that guitar. Figure 14 shows the time domain graph of the impulse responses, showcasing their octave relationship. It demonstrates the
In summary, Figure 15 shows the model for the
4. Discussion of results
The synthesis results for the fixed-size windows, given in Table 2, are poor, as verified with MoS (Mean Opinion Score),
4.1 Synthesis results of the generated impulse response and excitation signal
The separated excitation signal and the estimated impulse response are convolved to synthesize the acoustic guitar note. The ACDW approach gives good scope for better synthesis. The synthesis has been carried out for the two playing styles, namely finger plucked and pick plucked notes. The highest correlation coefficient is 0.98, for a finger plucked guitar note. The MoS (Mean Opinion Score) also indicates good synthesis quality, with a highest value of 0.95.
NN and machine learning algorithms are used to classify the plucking style and plucking expression of the original recorded notes. Once the model is trained, it is deployed for cross-validation of the music notes synthesized with the ACDW method. The results are discussed in subsection 4.2.
The contribution of this impulse response modeling work is to isolate the body response and the excitation signal of acoustic guitar notes. An
To summarize the work, impulse response modeling is implemented with good accuracy. Further, the neural network (NN) is used for classification of naïve and expert player considering the expression in the note played. The classification is also done for plucking style i.e. finger plucking and plectrum plucking. The results of the two methods of synthesis i.e. ACDW and Generalized model are cross verified by NN model. The trained model of the classification is used for verification of the synthesized notes.
4.2 Discussion of the cross-validation results
The synthesis of the acoustic guitar notes is implemented using the ACDW method and a generalized model is developed using the body response of the single fret. The validation of the synthesized guitar notes is done by the subjective (listening) tests and the correlation coefficient values. NN model is used for the classification of guitar notes with respect to plucking style and plucking expression. Further this trained model is used for testing the synthesized guitar notes to identify plucking style and plucking expression. This is named as cross validation of the models.
Table 3 summarizes the validation of the synthesized guitar notes based on: 1) Mean opinion Score i.e. MOS as the subjective tests and 2) Correlation coefficients as the statistical parameter for finding the similarity between original and synthesized music notes using ACDW approach. The NN model is used for classification of the music notes based on: 1) plucking style and 2) plucking expressions.
Table 4 summarizes the cross-validation results for the synthesized music notes of the expert player only. The last two columns are highlighted to show the class predicted by the NN model. The table values confirm the model validation, as the classification results are greater than 80%. Only a few sample results are shown in the table below.
|1||Finger pluck||Open string||Fret 1||Fret 2||Fret 3||Fret 4||Fret 5||Fret 6|
|2||Number of samples in the window||70||60||60||50||50||50||50|
|4||MOS score on the scale of (0–1)||0.95||0.95||0.95||0.9||0.8||0.8||0.9|
|5||Pick pluck||Open string||Fret 1||Fret 2||Fret 3||Fret 4||Fret 5||Fret 6|
|6||Number of samples in the window||50||190||100||170||300||60||60|
|8||MOS score on the scale of (0–1)||0.9||0.9||0.9||0.9||0.9||0.9||0.9|
Similarly, a cross-validation has been carried out for the plucking style where the acoustic guitar notes played by the Expert Player have been passed to the trained model and the plucking style is predicted. The results from the Impulse Response modeling method are considered for the identification of the plucking style. Table 5 summarizes the results of the trained model for identification of plucking style of synthesized notes using NN classifiers. The cells highlighted with yellow indicate the wrong classification of the plucking style. The second column gives the names of the music notes while the 3rd and the 4th columns indicate the correlation coefficients for the subjective tests and the impulse response method.
The limitations of the hammer method and the string-breaking method for obtaining the impulse response are overcome with the help of the cepstral domain window method. The challenge of isolating the impulse response from the excitation signal is overcome using the ACDW approach, and a model is developed using the body features. The main contributions of the present research work are: 1) a physical model for the guitar as an instrument using the Adaptive Cepstral Domain Window (ACDW) approach, 2) a generalized model for the impulse response of the acoustic guitar for all frets using the response of a single fret, and 3) classification of guitar notes based on plucking style and plucking expression. The synthesized notes are validated using subjective listening tests, i.e. the Mean Opinion Score (MOS), and correlation coefficients. The classification of plucking style and plucking expression is done using NN modeling techniques. The trained model is used for testing the plucking expression of the synthesized notes. This model can be used to certify whether a player is becoming an expert: if the score for expert identification is greater than 95%, the player can be certified as an expert.