Open access peer-reviewed chapter

A Case Study of Wavelets and SVM Application in Coffee Agriculture: Detecting Cicadas Based on Their Acoustic and Image Patterns

Written By

João Paulo Lemos Escola, Rodrigo Capobianco Guido, Alexandre Moraes Cardoso, Douglas Henrique Bottura Maccagnan, João Marcelo Ribeiro and José Ricardo Ferreira Cardoso

Reviewed: October 15th, 2019 Published: December 16th, 2019

DOI: 10.5772/intechopen.90156



One of the main problems in agriculture is the management of crop pests, which cause financial damage to farmers. This management is traditionally performed with pesticides; however, over a large application area, it would be more economically viable and more environmentally sound to know precisely which regions are actually infested. In coffee farms, the cicada makes a distinctive sound when it emerges after years of living underground as a nymph. One way to contribute to its management would be to develop a device capable of capturing the sound of the adult cicada in order to detect its presence and to quantify the insects in the crop. Such devices would be spread across the coffee plots to capture sounds over the widest possible area. With monitoring and quantification data, the manager would have more input for decision-making and could adopt the most appropriate management technique based on concrete information on population density for each crop region. Thus, this chapter presents an algorithm based on wavelets and support vector machines (SVMs) to detect acoustic patterns in plantations, indicating the presence of cicadas.


  • acoustic patterns
  • support vector machines
  • wavelets
  • digital signal processing
  • cicada

1. Introduction

As the human population grows rapidly, the need for food and other products from agricultural fields also grows. To guarantee production, monoculture is carried out over extensive areas, which intensifies the appearance of pests. Generally, pest control in agriculture is performed with chemical pesticides, often applied even in areas without pest incidence, which raises the cost of production and may cause environmental impacts that affect human health. The development of specific hardware and software for pest detection in agriculture can support alternative forms of production that reduce these negative impacts. In this sense, the capture, recording, and analysis of acoustic signals emitted by insects can be an alternative to optimize the production of certain crops [1].

The cicada (Hemiptera: Cicadidae) is a good example of an insect capable of emitting acoustic signals. In Brazil, coffee plantations can be attacked by several arthropods; among them, Quesada gigas is considered a key pest in the entire state of Minas Gerais and in the northeastern region of the state of São Paulo [2]. Cicadas have been reported in coffee plantations since the period between 1900 and 1904, and their presence has changed the way the crop is managed: it has practically forced coffee growers to adopt practices that improve the productive system, such as larger spacing between plants, which allows the crop to be mechanized so that pesticides can be applied against this and other pests and diseases that affect productivity.

Cicadas attack crops in search of sap, their main food. The damage to the plants occurs in the nymphal phase of the cicada, when it sucks sap from the roots of the host plant [3]. Q. gigas, which is the largest species in the country, can reach 70 mm in length, including wings, and 20 mm in width in the case of males. The females can reach 69 mm in total length and 16.5 mm in width, as shown in Figure 1. The size of the insect is probably the reason why it is associated with the damage caused to coffee plantations.

Figure 1.

Quesada gigas. On the left, male emitting acoustic signals. On the right, lateral view of male resting.

Males usually call from October to December. In the 1970s, since there was no efficient method for controlling cicadas, many coffee growers had no choice but to eradicate infested crops, and many of them even abandoned coffee cultivation. More recently, the recommended control has relied on systemic chemical pesticides and, lately, on a sound trap that attracts Q. gigas to a closed spraying system [4].

Few technological devices are used in coffee plantations to monitor and control cicadas, which may be why chemical control is the most widely used method. Considering that the methods currently used for mapping and monitoring cicada populations consist basically of nonautomated counting of individuals through direct observation, this chapter presents applications of digital signal processing and support vector machines (SVMs) for detecting and monitoring cicadas in crops and forests, reducing the time and cost of control.


2. Initial considerations

Machine learning is a branch of artificial intelligence that seeks to develop algorithms capable of learning certain behaviors or patterns through examples, being able to generalize from training. SVMs, for instance, are supervised machine learning methods with superior results compared to other pattern classification procedures, considering binary problems [5].

Thus, one possibility for improving agricultural procedures would be the development of a device capable of capturing the sound of the adult cicada in order to detect its presence, thus monitoring and quantifying crop insects. This device would be spread across the coffee fields to capture sounds within the widest possible area coverage. That allows for the manager to have more input for decision-making, adopting the most appropriate management technique based on concrete information on population density separated by crop region.

Cicadas emit particular frequency components which characterize certain patterns. To detect them, both the Fourier transform and the discrete wavelet transform (DWT), which convert a time-domain signal to the frequency domain, can be used. In the latter case, however, it is also possible to obtain the time support of the frequencies [6]. Then, an SVM can be used to refine the results, pointing out the existence of the important patterns in the wavelet-transformed signals. This is how the proposed approach was implemented. Related works, such as [1, 7, 8, 9, 10, 11, 12, 13, 14, 15], perform automated data collection for monitoring and corroborate the present work.
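The time support mentioned above can be illustrated with a single level of the Haar DWT. The sketch below is only an illustration written for this text (the class name HaarStep and the toy signal are assumptions, not the chapter's code): the detail coefficients are zero where the signal is flat and nonzero only at the position where it changes.

```java
import java.util.Arrays;

/** One level of the orthonormal Haar DWT (illustrative sketch). */
final class HaarStep {

    /** Returns {approximation, detail}; the input length must be even. */
    static double[][] decompose(double[] x) {
        if (x.length % 2 != 0) {
            throw new IllegalArgumentException("length must be even");
        }
        int half = x.length / 2;
        double[] approx = new double[half];
        double[] detail = new double[half];
        double norm = Math.sqrt(2.0);
        for (int n = 0; n < half; n++) {
            approx[n] = (x[2 * n] + x[2 * n + 1]) / norm; // low-pass, then downsample
            detail[n] = (x[2 * n] - x[2 * n + 1]) / norm; // high-pass, then downsample
        }
        return new double[][] {approx, detail};
    }

    public static void main(String[] args) {
        // Toy signal with a single jump between the third and fourth samples.
        double[] signal = {1, 1, 1, 5, 5, 5, 5, 5};
        double[][] bands = decompose(signal);
        // Only the second detail coefficient is nonzero, which locates the jump in time.
        System.out.println(Arrays.toString(bands[1]));
    }
}
```

A Fourier spectrum of the same signal would show which frequencies are present but not where the jump occurs.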


3. Application one (AP1): cicada density estimation by audio processing

This solution consists of a system that, from an input wav audio file, discriminates between three possibilities (noise, low density, and high density), assisting in the monitoring of cicada infestation in the coffee crop. The design is inspired by the human auditory system, which easily differentiates between these acoustic patterns. The wavelet-packet transform (WPT) works similarly to the cochlea in the human ear [16] and provides an efficient time-frequency mapping [6].

To assess this assumption, we carried out the following preprocessing procedure to convert each acoustic input signal of variable length into a 25-sample feature vector:



    • STEP 1: the raw data from the input signal i, recorded as a wav file sampled at 44,100 samples/s, 16-bit [16], is extracted and stored as the vector $s_i$, for $0 \le i \le X-1$. Each original wav file lasts about 3 s, i.e., $44{,}100 \times 3 = 132{,}300$ samples, with variations among them. Since a wavelet-based transformation is used in the next step, we cut the vectors $s_i$, keeping their central part, in such a way that their length became 262,144 samples, which is a power of two;

    • STEP 2: $s_i$ of size 262,144 is converted into its corresponding feature vector, i.e., $f_i$ of size 25, where $0 \le i \le X-1$. Particularly, $f_{ij}$, i.e., the jth component of $f_i$ where $0 \le j \le 24$, corresponds to the normalized energy of the jth Bark scale band of the wavelet-packet transformed input signal $s_i$ at the maximum decomposition level, as in Eq. 1, considering the natural frequency ordering [6], according to Table 1. Comments on the wavelet family used are presented ahead.

    • END.
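The two steps above can be sketched as follows. This is a minimal sketch under stated assumptions: the class and method names are hypothetical, the wavelet-packet transform itself is taken as already computed, and the band limits are passed in as inclusive sample indices (the values of Table 1 are not reproduced here). Normalization divides each band energy by the total energy of the transformed signal.

```java
import java.util.Arrays;

/** Sketch of STEP 1 (central crop) and STEP 2 (normalized band energies). */
final class FeatureExtraction {

    /** Keeps the central {@code length} samples of {@code s}. */
    static double[] centralCrop(double[] s, int length) {
        if (length > s.length) {
            throw new IllegalArgumentException("signal shorter than requested length");
        }
        int start = (s.length - length) / 2;
        return Arrays.copyOfRange(s, start, start + length);
    }

    /**
     * Normalized energy of each band of an already transformed signal.
     * Band j covers the samples first[j]..last[j], both inclusive.
     */
    static double[] normalizedBandEnergies(double[] coeffs, int[] first, int[] last) {
        double total = 0.0;
        for (double c : coeffs) {
            total += c * c; // total energy of the transformed signal
        }
        double[] f = new double[first.length];
        for (int j = 0; j < f.length; j++) {
            for (int n = first[j]; n <= last[j]; n++) {
                f[j] += coeffs[n] * coeffs[n];
            }
            if (total > 0.0) {
                f[j] /= total;
            }
        }
        return f;
    }

    public static void main(String[] args) {
        // Toy data: six samples cropped to four, then split into two bands.
        double[] cropped = centralCrop(new double[] {0, 1, 2, 3, 4, 5}, 4);
        double[] f = normalizedBandEnergies(cropped, new int[] {0, 2}, new int[] {1, 3});
        System.out.println(Arrays.toString(f));
    }
}
```

In the chapter's setting, the crop length would be 262,144 and the band arrays would hold the 25 initial and final WPT samples of Table 1.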

Bark band (a) | Band range (Hz) | Initial WPT sample | Final WPT sample | Energy range (b)

Table 1.

The 25 sets of samples, from the 18th-level WPT of size 262,144, are used to mimic the Bark scale.

(a) Bark band.

(b) Range in $f_i$[Bark band] Hz.

At that level, which is the maximum, and considering the original signal sampling rate of 44,100 samples/s, each sample of the transformed signal has a resolution of $\frac{44{,}100}{2^{18+1}} \approx 0.0841$ Hz. The energy of each one of the 25 sets is calculated separately and then normalized by dividing it by the total energy. Thus, $f_{ij}$ is the jth normalized energy, $0 \le j \le 24$, for a certain input signal $s_i$. The symbol $\lfloor \cdot \rfloor$ in the fourth column of the table represents the floor (rounding down) operation. It is required because a sample index is always an integer. Due to that rounding, the frequency range we obtained from the WPT tree is only an approximation to the Bark scale; however, it does not make any difference in practice.
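The resolution figure and the floor operation can be checked with a few lines of Java. The sketch below is illustrative only: the class and method names are hypothetical, and the band edges printed in main (100, 510 and 22,050 Hz) are example frequencies chosen for this text, not values copied from Table 1.

```java
/** Maps a frequency in Hz to a WPT sample index at a given decomposition level. */
final class BarkMapping {

    /** Frequency resolution, in Hz, of one sample of the transformed signal. */
    static double resolutionHz(double sampleRate, int level) {
        return sampleRate / Math.pow(2.0, level + 1);
    }

    /** Floor of frequency divided by resolution, computed without rounding the resolution first. */
    static int sampleIndex(double frequencyHz, double sampleRate, int level) {
        return (int) Math.floor(frequencyHz * Math.pow(2.0, level + 1) / sampleRate);
    }

    public static void main(String[] args) {
        double fs = 44100.0;
        int level = 18;
        System.out.println("resolution (Hz): " + resolutionHz(fs, level));
        // Example band edges in Hz (assumed for illustration only).
        double[] edges = {100.0, 510.0, 22050.0};
        for (double edge : edges) {
            System.out.println(edge + " Hz -> sample " + sampleIndex(edge, fs, level));
        }
    }
}
```

The Nyquist frequency, 22,050 Hz, maps to index 262,144, which is exactly the length of the transformed signal and therefore acts as an exclusive upper bound.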


We apply here the technique described in [17], which is based on paraconsistent logic, to analyze the behavior and suitability of the obtained data, i.e., the feature vectors. Thus, the data vectors were represented in the paraconsistent plane as the point P, according to Figure 2. Ideally, the closer P is to the corner $(G_1, G_2) = (1, 0)$, the better our feature vectors separate the classes, regardless of any specific classifier. The next step was to choose the best wavelet family, that is, the family that puts P closest to that corner.

Figure 2.

The paraconsistent plane where the axes $G_1$ and $G_2$ represent the degrees of certainty and contradiction, respectively. $P = (G_1, G_2) = (\alpha - \beta, \alpha + \beta - 1)$, drawn in blue just to exemplify, is an important element for our analysis: the closer it is to the corner (1, 0), the weaker the classifier associated with the feature vectors can be. The values of $\alpha$ and $\beta$ are derived from intra-class and inter-class analyses, respectively, as detailed in [17].
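As a small worked example of the caption above, the sketch below computes P from given values of α and β and measures its Euclidean distance to the corner (1, 0). The class name, the method names and the use of Euclidean distance as the ranking criterion are assumptions made for this illustration; how α and β are obtained from the feature vectors is described in [17] and is not reproduced here.

```java
/** Point P = (G1, G2) in the paraconsistent plane, from given alpha and beta. */
final class ParaconsistentPoint {

    /** Returns {G1, G2} = {alpha - beta, alpha + beta - 1}. */
    static double[] point(double alpha, double beta) {
        return new double[] {alpha - beta, alpha + beta - 1.0};
    }

    /** Euclidean distance from P to the corner (1, 0). */
    static double distanceToIdeal(double alpha, double beta) {
        double[] p = point(alpha, beta);
        return Math.hypot(p[0] - 1.0, p[1]);
    }

    public static void main(String[] args) {
        // Hypothetical values for two candidate wavelet families.
        System.out.println(distanceToIdeal(0.9, 0.2));
        System.out.println(distanceToIdeal(0.6, 0.5));
    }
}
```

Under these assumptions, the candidate family with the smaller distance would be preferred.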

Next, we considered an SVM to be a proper classifier to interpret the feature vectors selected during the preprocessing step, because of its strong performance in binary classification [5]. Accordingly, Figure 3 shows the complete setup for the proposed approach. It is divided into two phases, i.e., training and testing, with four steps each. As discussed ahead, a total of X and Y vectors were isolated to carry out each phase, respectively, where X+Y corresponds to the number of acoustic files in the database.

Figure 3.

The experimental setup for the proposed application one.

The SVM has been implemented, as described in [5], in such a way that it receives the input vectors defined in the preprocessing step.

The detailed procedures are as follows:



    • All the X vectors fi were used to train an SVM with 25 input passive elements, X hidden active non-linear elements and one output active linear element, as in Figure 4. X elements were used in the hidden layer to allow for a simple and effective training scheme, as explained in [5]. The kth element in the hidden layer uses a function of the form

Figure 4.

The SVM structure used in the application one approach. The weights determined during the supervised part of the training are $\{w_0, w_1, \ldots, w_{X-1}\}$. The output element linearly combines the outputs of the hidden layer with the weights.


  • implying that the kth element outputs 1 for the kth training vector and a value in the range (0, 1) for the others, where $0 \le k \le X-1$. This corresponds to an unsupervised procedure. There are no weights between the input and hidden layers; however, there are X weights between the hidden and output layers. To find them, a linear system of X equations in X unknowns is established and solved, which constitutes a supervised task. In that system, the closest resultant value from the SVM set corresponds to the answer.

  • END.
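The training procedure above can be sketched in Java as follows. The hidden-element function is not reproduced in this text, so the sketch assumes a Gaussian of the squared Euclidean distance, which is one function that outputs 1 for its own training vector and a value in (0, 1) for any other; the class name KernelMachine and all method names are likewise hypothetical. The X weights are found by solving the X-by-X linear system with Gaussian elimination.

```java
/** Sketch: X hidden elements centred on the training vectors, one linear output. */
final class KernelMachine {
    private final double[][] centers; // the X training vectors
    private final double[] weights;   // w_0 ... w_{X-1}

    KernelMachine(double[][] train, double[] targets) {
        int x = train.length;
        double[][] system = new double[x][x + 1]; // augmented matrix [G | d]
        for (int i = 0; i < x; i++) {
            for (int k = 0; k < x; k++) {
                system[i][k] = hidden(train[i], train[k]);
            }
            system[i][x] = targets[i];
        }
        this.centers = train;
        this.weights = solve(system);
    }

    /** ASSUMED Gaussian element: 1 at its own centre, in (0, 1) elsewhere. */
    static double hidden(double[] v, double[] center) {
        double d2 = 0.0;
        for (int n = 0; n < v.length; n++) {
            double diff = v[n] - center[n];
            d2 += diff * diff;
        }
        return Math.exp(-d2);
    }

    /** Linear combination of the hidden-layer outputs. */
    double output(double[] v) {
        double y = 0.0;
        for (int k = 0; k < centers.length; k++) {
            y += weights[k] * hidden(v, centers[k]);
        }
        return y;
    }

    /** Gaussian elimination with partial pivoting on an augmented matrix. */
    private static double[] solve(double[][] a) {
        int n = a.length;
        for (int col = 0; col < n; col++) {
            int pivot = col;
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) pivot = r;
            }
            double[] tmp = a[col]; a[col] = a[pivot]; a[pivot] = tmp;
            if (a[col][col] == 0.0) {
                throw new IllegalArgumentException("singular system (duplicate training vectors?)");
            }
            for (int r = col + 1; r < n; r++) {
                double factor = a[r][col] / a[col][col];
                for (int c = col; c <= n; c++) a[r][c] -= factor * a[col][c];
            }
        }
        double[] w = new double[n];
        for (int r = n - 1; r >= 0; r--) {
            double sum = a[r][n];
            for (int c = r + 1; c < n; c++) sum -= a[r][c] * w[c];
            w[r] = sum / a[r][r];
        }
        return w;
    }
}
```

Because there are as many hidden elements as training vectors, the trained machine reproduces the target of every training vector exactly (up to rounding), which is what makes the training scheme simple.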

Once the training procedures are over, the system is ready for testing, as follows.



    • Each one of the Y testing vectors of size 25 is passed through the trained SVM and the corresponding output is verified: the SVM with the result closest to zero is elected;

    • END.
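The election rule of the testing phase can be sketched as below. The sketch assumes one trained machine per class (noise, low density, high density) and that each machine's output for a test vector is already available; this reading, together with the class and method names, is an assumption and not a detail given in the text.

```java
/** Picks the class whose machine output is closest to zero. */
final class ClassElection {

    /** Returns the index of the output with the smallest absolute value. */
    static int closestToZero(double[] outputs) {
        int best = 0;
        for (int c = 1; c < outputs.length; c++) {
            if (Math.abs(outputs[c]) < Math.abs(outputs[best])) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical outputs of the three class-specific machines for one test vector.
        double[] outputs = {0.8, -0.05, 0.4};
        System.out.println("elected class index: " + closestToZero(outputs));
    }
}
```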


4. Application two (AP2): cicada density estimation by image processing

Visually, cicadas are quite noticeable in the farming environment, so one management option would be to include a camera for visual detection of the pests, with an adjustable capture interval for sending the data to a web server.

The tests were performed using images captured in the coffee crop and divided into three classes that represent the pest infestation density: high, low, and none. The submitted images have been converted to grayscale and have a size of 320 × 240 pixels, as in Figure 5.

Figure 5.

Examples of images used: high, low, and zero density, respectively.

Similar to AP1, an SVM is used to classify the preprocessed input energies; however, in this case we use the N normalized energies of each wavelet sub-band, instead of 25 Bark bands, according to the decomposition level defined in each test instance. In Figure 6, we present a flowchart that illustrates the process of capturing and processing the images.

Figure 6.

The experimental setup for the proposed application two.

In Figure 7, we have the SVM illustration with $n = 4^{level}$ entries, corresponding to the number of energies of the selected level.

Figure 7.

The SVM structure used in application two. Similar to AP1, the weights determined during the supervised part of the training are $\{w_0, w_1, \ldots, w_{X-1}\}$. The output element linearly combines the outputs of the hidden layer with the weights.
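The relation between the decomposition level and the number of SVM inputs can be illustrated with a full two-dimensional Haar packet decomposition, in which every sub-band is split into four at each level. The sketch below is an assumption-laden illustration: the Haar filter, the class name SubbandEnergies and the sub-band ordering are choices made for this text, not details taken from the chapter.

```java
import java.util.ArrayList;
import java.util.List;

/** Normalized sub-band energies of a full 2D Haar packet decomposition. */
final class SubbandEnergies {

    /** Returns 4^level normalized energies for a grayscale image. */
    static double[] compute(double[][] image, int level) {
        List<Double> energies = new ArrayList<>();
        decompose(image, level, energies);
        double total = 0.0;
        for (double e : energies) {
            total += e;
        }
        double[] out = new double[energies.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = total > 0.0 ? energies.get(i) / total : 0.0;
        }
        return out;
    }

    private static void decompose(double[][] band, int level, List<Double> energies) {
        if (level == 0) {
            double e = 0.0;
            for (double[] row : band) {
                for (double v : row) e += v * v;
            }
            energies.add(e);
            return;
        }
        int rows = band.length / 2;
        int cols = band[0].length / 2;
        if (rows * 2 != band.length || cols * 2 != band[0].length) {
            throw new IllegalArgumentException("band dimensions must be even at every level");
        }
        // One approximation and three detail sub-bands per split.
        double[][][] sub = new double[4][rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                double a = band[2 * r][2 * c], b = band[2 * r][2 * c + 1];
                double d = band[2 * r + 1][2 * c], e = band[2 * r + 1][2 * c + 1];
                sub[0][r][c] = (a + b + d + e) / 2.0;
                sub[1][r][c] = (a - b + d - e) / 2.0;
                sub[2][r][c] = (a + b - d - e) / 2.0;
                sub[3][r][c] = (a - b - d + e) / 2.0;
            }
        }
        for (double[][] s : sub) {
            decompose(s, level - 1, energies);
        }
    }
}
```

For a 320 × 240 image, this particular scheme supports up to four levels, since the height of 240 pixels can be halved four times before becoming odd; level 4 would yield 256 energies.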


5. Tests and results

The algorithms proposed here were implemented in the Java programming language.

5.1 AP1

Thirty-five files from each class were collected and used. The tests were performed with the files not used for training, and two to five training files per class were tried. For each set of training files, the maximum and the mean DWT decomposition levels were tested with each of the 46 wavelet filters presented in Table 2. The best result, i.e., 96.88% accuracy, was obtained with the Haar filter, supporting our hypothesis that this system is viable for use in the coffee crop.

Wavelet | Train files/class | Test files | Percentage

Table 2.

Test results from AP1.

A cross-validation procedure was used to obtain the best result presented in Table 2. The algorithm developed for cross-validation is illustrated in Figure 8.

Figure 8.

Cross-validation algorithm.
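Since the figure is not reproduced in this text, the following is only a generic sketch of one way such a procedure could split the 35 files of a class into training and test sets: a seeded random hold-out that can be repeated with different seeds. It should not be read as the algorithm of Figure 8; the class name, method name and seed are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Random hold-out split of the files of one class (generic sketch). */
final class HoldOutSplit {

    /** Returns {trainIndices, testIndices} for a class with {@code files} recordings. */
    static int[][] split(int files, int trainCount, long seed) {
        if (trainCount < 0 || trainCount > files) {
            throw new IllegalArgumentException("invalid number of training files");
        }
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < files; i++) {
            order.add(i);
        }
        Collections.shuffle(order, new Random(seed)); // reproducible shuffle
        int[] train = new int[trainCount];
        int[] test = new int[files - trainCount];
        for (int i = 0; i < files; i++) {
            if (i < trainCount) {
                train[i] = order.get(i);
            } else {
                test[i - trainCount] = order.get(i);
            }
        }
        return new int[][] {train, test};
    }

    public static void main(String[] args) {
        // 35 files per class, 3 of them used for training in this example.
        int[][] parts = split(35, 3, 42L);
        System.out.println(parts[0].length + " training files, " + parts[1].length + " test files");
    }
}
```

Repeating the split with several seeds and averaging the accuracy gives an estimate that depends less on which files happen to be chosen for training.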

5.2 AP2

AP2 is an application that uses digital image processing to estimate cicada density in a coffee crop. To test it, 35 high-density class files, 35 low-density class files, and 35 files considered by an expert as zero density were used. In Tables 3 and 4, we present the results of the tests performed in the laboratory, which demonstrate the viability of a future deployment of that system.

Wavelet | Level | Train files/class | Test files | Percentage

Table 3.

Test results from AP2.


6. Conclusions

Both systems are being implemented in hardware for real-time deployment in coffee crops using ESP8266 devices and their derivatives, integrated with a cloud server that stores and organizes the data to aid farmer decision-making. In future work, we intend to present practical results of these implementations.

             | High density | Low density | Null density | Total
High density | 30           | 3           | 0            | 33
Low density  | 1            | 31          | 1            | 33
Null density | 0            | 0           | 33           | 33

Table 4.

Best confusion matrix from AP2.
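The overall accuracy implied by Table 4 can be computed directly from the matrix: the diagonal holds the correctly classified files. The sketch below (class and method names are hypothetical) does this computation; the result does not depend on whether rows are read as actual or as predicted classes.

```java
/** Overall accuracy from a square confusion matrix. */
final class ConfusionMatrix {

    /** Sum of the diagonal divided by the sum of all entries. */
    static double accuracy(int[][] m) {
        long correct = 0;
        long total = 0;
        for (int r = 0; r < m.length; r++) {
            for (int c = 0; c < m[r].length; c++) {
                total += m[r][c];
                if (r == c) {
                    correct += m[r][c];
                }
            }
        }
        return (double) correct / total;
    }

    public static void main(String[] args) {
        // Values from Table 4, without the "Total" column.
        int[][] table4 = {
            {30, 3, 0},
            {1, 31, 1},
            {0, 0, 33}
        };
        System.out.printf("accuracy: %.4f%n", accuracy(table4));
    }
}
```

For Table 4, 94 of the 99 test files lie on the diagonal, which corresponds to an accuracy of about 94.95%.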

Both AP1 and AP2 are systems that can be used as an additional tool for coffee crop pest control and management. One possibility to be studied by the present research group is the combined use of both modalities in a single system in order to improve the results further.

An important feature used in the laboratory experiments was Java object serialization [18], which allowed the wavelet transform in both AP1 and AP2 to be performed only once: its result was stored on disk and recovered during the repeated cross-validation tests, substantially reducing processing time.
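A minimal sketch of this caching idea, using the standard java.io object streams, is shown below. The class name TransformCache, the method names and the temporary file are assumptions made for illustration; the chapter does not describe how its cache was organized.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Stores transformed coefficients on disk so they are computed only once. */
final class TransformCache {

    /** Serializes the coefficient array to the given file. */
    static void save(Path file, double[] coefficients) {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(coefficients);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads a previously serialized coefficient array. */
    static double[] load(Path file) {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (double[]) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Saves to a temporary file, loads the data back, and removes the file. */
    static double[] roundTrip(double[] coefficients) {
        try {
            Path file = Files.createTempFile("wpt-cache", ".ser");
            try {
                save(file, coefficients);
                return load(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        double[] restored = roundTrip(new double[] {1.5, -2.0, 0.25});
        System.out.println(restored.length + " coefficients restored");
    }
}
```

In a cross-validation loop, each run would call load when the cache file exists and compute and save the transform only when it does not.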


  1. Mankin RW et al. Perspective and promise: A century of insect acoustic detection and monitoring. American Entomologist. 2011;57(1):30-44
  2. Martinelli NM, Zucchi RA. Cicadas (Hemiptera: Cicadidae: Tibicinidae) associated with coffee: Distribution, hosts and key to species (in Portuguese). Anais da Sociedade Entomológica do Brasil. 1997:133-143
  3. De Souza JC. Coffee Cicada in Minas Gerais: Historical, Reconnaissance, Biology, Damage and Control (in Portuguese). Belo Horizonte: EPAMIG; 2007
  4. Maccagnan DHB. Cicada (Hemiptera: Cicadidae): Emergence, Acoustic Behavior and Sound Trap Development (in Portuguese) [PhD thesis]. Faculdade de Filosofia, Ciências e Letras da Universidade de São Paulo; 2008
  5. Haykin S. Neural Networks and Learning Machines. 3rd ed. India: Pearson Education; 2010
  6. Guido RC. Effectively interpreting discrete wavelet transformed signals. IEEE Signal Processing Magazine. 2017;34(3):89-100
  7. Dawson DK, Efford MG. Bird population density estimated from acoustic signals. Journal of Applied Ecology. 2009;46(6):1201-1209
  8. Eliopoulos PA, Potamitis I, Kontodimas DC. Estimation of population density of stored grain pests via bioacoustic detection. Crop Protection. 2016;85:71-78
  9. Eliopoulos PA et al. Detection of adult beetles inside the stored wheat mass based on their acoustic emissions. Journal of Economic Entomology. 2015;108(6):2808-2814
  10. Marques TA et al. Estimating animal population density using passive acoustics. Biological Reviews. 2013;88(2):287-309
  11. Gardiner T, Hill J. A comparison of three sampling techniques used to estimate the population density and assemblage diversity of Orthoptera. Journal of Orthoptera Research. 2006:45-51
  12. Langer F et al. Geometrical stem detection from image data for precision agriculture. 2018. arXiv preprint arXiv:1812.05415
  13. Burgos-Artizzu XP et al. Real-time image processing for crop/weed discrimination in maize fields. Computers and Electronics in Agriculture. 2011;75(2):337-346
  14. Li Y et al. In-field cotton detection via region-based semantic image segmentation. Computers and Electronics in Agriculture. 2016;127:475-486
  15. Burgos-Artizzu XP et al. Improving weed pressure assessment using digital images from an experience-based reasoning approach. Computers and Electronics in Agriculture. 2009;65(2):176-185
  16. Bossi M, Goldberg E. Introduction to Digital Audio Coding and Standards. Springer Science & Business Media; 2012
  17. Guido RC. Paraconsistent feature engineering. IEEE Signal Processing Magazine. 2019;36(1):154-158
  18. Haverlock K. Object serialization, Java, and C++. Dr. Dobb’s Journal: Software Tools for the Professional Programmer. 1998;23(8):32-35
