A Case Study of Wavelets and SVM Application in Coffee Agriculture: Detecting Cicadas Based on their Acoustic and Image Patterns

One of the main problems in agriculture is crop pest management, since pests cause financial damage to farmers. This management is traditionally performed with pesticides; however, when the area of application is large, it would be more economically viable and environmentally sound to know precisely which regions are actually infested. In coffee farms, the cicada makes a distinctive sound when it emerges after years of living underground as a nymph. One way of contributing to its management would be the development of a device capable of capturing the sound of the adult cicada in order to detect its presence and to quantify the insects in the crop. Such devices would be spread across the coffee plots to capture sounds over the widest possible area. With monitoring and quantification data, the manager would have more input for decision-making and could adopt the most appropriate management technique based on concrete information on population density for each crop region. Thus, this chapter presents an algorithm based on wavelets and support vector machines (SVMs) to detect acoustic patterns in plantations, advising on the presence of cicadas.


Introduction
Since the human population has been increasing rapidly, the need for food and other products from agricultural fields also grows. To guarantee production, monoculture is carried out over extensive areas, which intensifies the appearance of pests. Generally, pest control in agriculture is performed with chemical pesticides, often applied even in areas without pest incidence, which raises the cost of production and may cause environmental impacts that affect human health.
The development of specific hardware and software for pest detection in agriculture can provide support for distinct production forms which reduce negative impacts. In this sense, the capture, recording, and analysis of acoustic signals emitted by insects can be an alternative to optimize the production of certain crops [1].
The cicada (Hemiptera: Cicadidae) is a good example of an insect capable of emitting acoustic signals. In Brazil, coffee plantations can be attacked by several arthropods and, among them, Quesada gigas is considered a key pest in the entire state of Minas Gerais and in the northeastern region of the state of São Paulo [2]. The occurrence of cicadas in coffee plantations has been reported since the period between 1900 and 1904, and this pest has influenced the way the crop is managed: it has practically forced coffee growers to adopt practices that improve the production system, such as wider spacing between plants, which allows mechanized application of pesticides to manage this and other pests and diseases that affect productivity.
Cicadas attack crops in search of sap, their main food. The impact on the plants occurs in the nymphal phase of the cicada, when it sucks sap from the roots of the host plant [3]. Q. gigas, the largest cicada species in the country, can reach 70 mm in length, including wings, and 20 mm in width in the case of males. The females can reach 69 mm in total length and 16.5 mm in width, as shown in Figure 1. The size of the insect is probably the reason why it is associated with the impact caused in coffee plantations.
Males usually sing from October to December. In the 1970s, since there was no efficient method for controlling cicadas, many coffee growers had no choice other than to eradicate infested crops. Many of them even abandoned their cultivation. Lately, however, the recommended control has been carried out with systemic pesticides and, more recently, with a sound trap that attracts Q. gigas to a closed spraying system [4].
There are few technological devices used in coffee plantations to monitor and control cicadas and, perhaps because of this, chemical control is the most widely used method. Considering that the methods currently used for mapping and monitoring cicada populations consist basically of non-automated counting of individuals through direct observation, this chapter presents some applications that use digital signal processing and support vector machines (SVMs) as techniques for detecting and monitoring cicadas in crops and forests, reducing time and control costs.

Initial considerations
Machine learning is a branch of artificial intelligence that seeks to develop algorithms capable of learning certain behaviors or patterns through examples, being able to generalize from training. SVMs, for instance, are supervised machine learning methods with superior results compared to other pattern classification procedures, considering binary problems [5].
Thus, one possibility for improving agricultural procedures would be the development of a device capable of capturing the sound of the adult cicada in order to detect its presence, thus monitoring and quantifying the insects in the crop. Such devices would be spread across the coffee fields to capture sounds over the widest possible area. This gives the manager more input for decision-making, so that the most appropriate management technique can be adopted based on concrete information on population density for each crop region.
Cicadas emit particular frequency components which characterize certain patterns. To detect them, both the Fourier transform and the discrete wavelet transform (DWT), which convert a time-domain signal to the frequency domain, can be used. In the latter case, however, it is still possible to obtain the time support of the frequencies [6]. Then, an SVM can be used to refine the results, pointing out the existence of the important patterns in the wavelet-transformed signals. This is how the proposed approach was implemented. Related works, such as [1, 9–13, 15–18], perform automated data collection for monitoring and corroborate the present work.
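To illustrate how a wavelet transform keeps the time support of a frequency event, the following minimal sketch applies a single level of the orthonormal Haar DWT to a short signal containing one isolated spike. The class name HaarStep and the toy signal are assumptions made only for this illustration; they are not taken from the implementation described in this chapter.

```java
import java.util.Arrays;

/** Illustrative single-level Haar DWT (hypothetical helper, not the chapter's code). */
final class HaarStep {

    /** Returns {approximation, detail} for one level of the orthonormal Haar DWT. */
    static double[][] forward(double[] x) {
        if (x.length % 2 != 0) {
            throw new IllegalArgumentException("signal length must be even");
        }
        int half = x.length / 2;
        double[] approx = new double[half];
        double[] detail = new double[half];
        double norm = Math.sqrt(2.0);
        for (int k = 0; k < half; k++) {
            approx[k] = (x[2 * k] + x[2 * k + 1]) / norm; // low-pass: local average
            detail[k] = (x[2 * k] - x[2 * k + 1]) / norm; // high-pass: local difference
        }
        return new double[][] {approx, detail};
    }

    public static void main(String[] args) {
        // Toy signal: constant except for one spike at sample 4.
        double[] signal = {1, 1, 1, 1, 5, 1, 1, 1};
        double[] detail = forward(signal)[1];
        // Only the detail coefficient covering samples 4 and 5 is non-zero.
        System.out.println(Arrays.toString(detail));
    }
}
```

The only non-zero detail coefficient is the one covering the pair of samples that contains the spike, so the transform reports both that a high-frequency event exists and where it occurs, which a plain Fourier magnitude spectrum does not provide.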

Application one (AP1): cicada density estimation by audio processing
This solution consists of a system that, from an input wav audio file, discriminates between three possibilities (noise, low density, and high density), assisting in monitoring the cicada infestation in the coffee crop. Because it relies on features inspired by the human auditory system, which easily differentiates between these acoustic patterns, the system proves to be efficient. Working similarly to the cochlea in the human ear [7], the wavelet-packet transform provides an efficient time-frequency mapping [6].
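As a rough sketch of the wavelet-packet idea, the code below decomposes a signal with the Haar filter pair and returns the sub-bands in natural frequency order. Haar is used here only for brevity; the class name HaarPacketTree and its methods are hypothetical. The frequency ordering relies on the standard rule that the low-pass and high-pass outputs are swapped at every node whose index is odd.

```java
/** Illustrative Haar wavelet-packet tree in natural frequency order (hypothetical helper). */
final class HaarPacketTree {

    /** One Haar analysis step: returns {low-pass output, high-pass output}. */
    static double[][] haar(double[] x) {
        int half = x.length / 2;
        double[] low = new double[half];
        double[] high = new double[half];
        double norm = Math.sqrt(2.0);
        for (int k = 0; k < half; k++) {
            low[k] = (x[2 * k] + x[2 * k + 1]) / norm;
            high[k] = (x[2 * k] - x[2 * k + 1]) / norm;
        }
        return new double[][] {low, high};
    }

    /** Full packet decomposition; result[n] is the n-th band, from lowest to highest frequency. */
    static double[][] decompose(double[] signal, int levels) {
        if (levels < 0 || signal.length == 0 || signal.length % (1 << levels) != 0) {
            throw new IllegalArgumentException("length must be a non-zero multiple of 2^levels");
        }
        double[][] nodes = {signal.clone()};
        for (int level = 0; level < levels; level++) {
            double[][] next = new double[nodes.length * 2][];
            for (int n = 0; n < nodes.length; n++) {
                double[][] split = haar(nodes[n]);
                boolean odd = (n & 1) == 1;
                // Odd nodes have a mirrored spectrum, so their children are swapped.
                next[2 * n] = odd ? split[1] : split[0];
                next[2 * n + 1] = odd ? split[0] : split[1];
            }
            nodes = next;
        }
        return nodes;
    }

    /** Sum of squared coefficients of one band. */
    static double energy(double[] band) {
        double sum = 0.0;
        for (double v : band) {
            sum += v * v;
        }
        return sum;
    }
}
```

For example, decomposing the alternating signal {1, -1, 1, -1, 1, -1, 1, -1} to two levels places all of its energy in the last of the four bands, as expected for a component at the highest representable frequency.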
Based on our assumptions, and in order to assess them, we carried out the following preprocessing procedure to convert each acoustic input signal of variable length into a 25-sample-long feature vector:
• AP1 PRE-PROCESSING PHASE:
• BEGINNING.
• STEP 1: the raw data from the input signal $i$, recorded as a wav file sampled at 44,100 samples/s, 16-bit [7], is extracted and stored as the vector $s_i[\cdot]$, for $0 \leqslant i \leqslant (X-1)$. Each original wav file lasts about 3 s, i.e., $44100 \cdot 3 = 132300$ samples, with variations among them. Since a wavelet-based transformation is used in the next step, we cut the vectors $s_i[\cdot]$, taking advantage of their central part, in such a way that their length became 262,144 samples, which is a power of two;
• STEP 2: $s_i[\cdot]$ of size 262,144 is converted into its corresponding feature vector, i.e., $f_i[\cdot]$, where $f_i[j]$, $(0 \leqslant j \leqslant 24)$, corresponds to the normalized energy of the $j$th Bark scale band of the wavelet-packet transformed input signal $s_i[\cdot]$ at the maximum decomposition level, as in Eq. 1, considering the natural frequency ordering [6], according to Table 1. Comments on the wavelet family used are presented ahead.
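The two steps above can be sketched as follows, under stated assumptions. The band edges in BARK_EDGES_HZ are the commonly cited critical-band limits, with a 25th band closed at 22,050 Hz; they are an assumption of this sketch and may differ from the exact ranges in Table 1. The class BarkFeatures and its methods are hypothetical names, and the coefficients passed to normalizedBandEnergies are assumed to be wavelet-packet coefficients already arranged in natural frequency order.

```java
import java.util.Arrays;

/** Illustrative AP1 preprocessing helpers (hypothetical names, assumed band edges). */
final class BarkFeatures {

    /** Assumed band limits in Hz: 26 edges define 25 bands. */
    static final double[] BARK_EDGES_HZ = {
        0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480, 1720,
        2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700, 9500, 12000,
        15500, 22050};

    /** STEP 1: keeps the central n samples of the recording. */
    static double[] centerCrop(double[] samples, int n) {
        if (samples.length < n) {
            throw new IllegalArgumentException("recording shorter than requested length");
        }
        int start = (samples.length - n) / 2;
        return Arrays.copyOfRange(samples, start, start + n);
    }

    /** STEP 2: normalized energy of each band from frequency-ordered coefficients. */
    static double[] normalizedBandEnergies(double[] coeffs, double sampleRate) {
        double nyquist = sampleRate / 2.0;
        int bands = BARK_EDGES_HZ.length - 1;
        double[] f = new double[bands];
        double total = 0.0;
        for (int j = 0; j < bands; j++) {
            int lo = index(BARK_EDGES_HZ[j], nyquist, coeffs.length);
            // The last band always runs to the end of the coefficient vector.
            int hi = (j == bands - 1)
                    ? coeffs.length
                    : index(BARK_EDGES_HZ[j + 1], nyquist, coeffs.length);
            for (int k = lo; k < hi; k++) {
                f[j] += coeffs[k] * coeffs[k];
            }
            total += f[j];
        }
        if (total > 0.0) {
            for (int j = 0; j < bands; j++) {
                f[j] /= total; // division by the total energy
            }
        }
        return f;
    }

    /** Floor of the coefficient index for a frequency, clamped to the vector length. */
    private static int index(double hz, double nyquist, int n) {
        return (int) Math.min(n, Math.floor(hz * n / nyquist));
    }
}
```

With 262,144 coefficients and a sampling rate of 44,100 samples/s, each coefficient spans about 0.0841 Hz, so the first assumed band (0 to 100 Hz) covers the first 1188 coefficients after the floor operation.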
Table 1.
At the maximum decomposition level, and considering the original signal sampling rate of 44,100 samples/s, each sample of the transformed signal has a resolution of $\frac{44100}{2^{18+1}} = 0.0841$ Hz. The energy of each one of the 25 sets is calculated separately and then normalized by dividing it by the total energy. Thus, $f_i[j]$ is the $j$th normalized energy, $(0 \leqslant j \leqslant 24)$, for a certain input signal $s_i[\cdot]$. The symbol $\lfloor\cdot\rfloor$ in the fourth column of the table represents the floor rounding operation. It is required because a sample index is always an integer number. Due to that rounding, the frequency range obtained from the WPT tree is only an approximation to the Bark scale; however, this makes no difference in practice.
We apply here the technique described in [8], which is based on paraconsistent logic, to analyze the behavior and suitability of the obtained data, i.e., the feature vector. Thus, the data vectors were represented in the paraconsistent plane as point $P$, according to Figure 2. Ideally, the closer $P$ is to the corner $(G_1, G_2) = (1, 0)$, the better our feature vectors separate the classes, regardless of any specific classifier. The next step was to choose the best wavelet family, that is, the family that puts $P$ closest to that corner.
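A small sketch of how a candidate wavelet family could be scored on the paraconsistent plane is given below. It uses the relation stated in the Figure 2 caption, P = (G1, G2) = (α − β, α + β − 1), and measures the Euclidean distance from P to the corner (1, 0) named there. How α and β are obtained from the intra-class and inter-class analyses is detailed in [8] and is not reproduced here; the class name and the sample values used in the usage note are hypothetical.

```java
/** Illustrative paraconsistent-plane helper (hypothetical name and inputs). */
final class ParaconsistentPoint {

    /** Maps (alpha, beta) to P = (G1, G2) = (alpha - beta, alpha + beta - 1). */
    static double[] point(double alpha, double beta) {
        return new double[] {alpha - beta, alpha + beta - 1.0};
    }

    /** Euclidean distance from P to the corner (1, 0); smaller is better. */
    static double distanceToCorner(double alpha, double beta) {
        double[] p = point(alpha, beta);
        return Math.hypot(p[0] - 1.0, p[1]);
    }
}
```

For instance, the hypothetical pair α = 1, β = 0 gives P = (1, 0) and a distance of zero, while α = 0.8, β = 0.3 gives a point near (0.5, 0.1), farther from the corner. Under this scoring, the wavelet family with the smallest distance would be preferred.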

Figure 2.
The paraconsistent plane, where the axes $G_1$ and $G_2$ represent the degrees of certainty and contradiction, respectively. $P = (G_1, G_2) = (\alpha - \beta, \alpha + \beta - 1)$, drawn in blue just to exemplify, is an important element for our analysis: the closer it is to the corner (1,0), the weaker the classifier associated with the feature vector can be. The values of $\alpha$ and $\beta$ are derived from intra-class and inter-class analyses, respectively, as detailed in [8].
Proceeding, we considered that an SVM would be a proper classifier to interpret the feature vectors selected during the preprocessing step, because of its excellence in binary classification [5]. The SVM was implemented as described in [5], in such a way that it receives the input vectors defined in the preprocessing step. Figure 3 shows the complete setup for the proposed approach, which is divided into two phases, training and testing, with four steps each. As discussed ahead, a total of $X$ and $Y$ vectors were isolated to carry out each phase, respectively, where $X + Y$ corresponds to the number of acoustic files in the database.
The detailed procedures are as follows:
• AP1 TRAINING PHASE:
• BEGINNING
• All the $X$ vectors $f_i[\cdot]$ were used to train an SVM with 25 passive input elements, $X$ active non-linear hidden elements, and one active linear output element, as in Figure 4. $X$ elements were used in the hidden layer to allow for a simple and effective training scheme, as explained in [5]. The $k$th element in the hidden layer uses a function of the form $e^{-d(t_k, v)}$, where $d$ is the Euclidean distance, $t_k$ is the $k$th training vector, and $v$ is the input vector under analysis, implying that the $k$th element outputs 1 for the $k$th training vector and a value in the range (0-1) for the others, where $0 \leqslant k \leqslant (X-1)$. This corresponds to a non-supervised procedure. There are no weights between the input and the hidden layers; however, there are $X$ weights between the hidden and the output layers. To find them, a linear system of $X$ equations in $X$ unknowns is established and solved, which is a supervised task. In that system, the closest resultant value from the SVM set corresponds to the answer.
Once the training procedures are over, the system is ready for testing, as follows.
• Each one of the Y testing vectors of size 25 is passed through the trained SVM and the corresponding output is verified: the SVM with the result closest to zero is elected;
• END.
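The training and testing phases above can be sketched as a small network with one hidden element per training vector, using the exponential of the negative Euclidean distance, whose output weights come from solving an X-by-X linear system. This is only an illustration of that scheme: the class name RbfNetwork, the use of Gaussian elimination with partial pivoting, and the target values chosen by the caller are assumptions, not the implementation from [5].

```java
/** Illustrative network trained by solving a linear system (hypothetical helper). */
final class RbfNetwork {
    private final double[][] centers; // one hidden element per training vector
    private final double[] weights;   // hidden-to-output weights

    RbfNetwork(double[][] trainingVectors, double[] targets) {
        int n = trainingVectors.length;
        double[][] system = new double[n][n + 1]; // augmented matrix [H | targets]
        for (int i = 0; i < n; i++) {
            for (int k = 0; k < n; k++) {
                system[i][k] = hidden(trainingVectors[k], trainingVectors[i]);
            }
            system[i][n] = targets[i];
        }
        this.centers = trainingVectors;
        this.weights = solve(system);
    }

    /** Hidden element: exp(-Euclidean distance between center and input). */
    static double hidden(double[] center, double[] input) {
        double sum = 0.0;
        for (int j = 0; j < center.length; j++) {
            double d = center[j] - input[j];
            sum += d * d;
        }
        return Math.exp(-Math.sqrt(sum));
    }

    /** Output element: linear combination of the hidden outputs. */
    double output(double[] input) {
        double y = 0.0;
        for (int k = 0; k < centers.length; k++) {
            y += weights[k] * hidden(centers[k], input);
        }
        return y;
    }

    /** Gaussian elimination with partial pivoting on an augmented matrix. */
    private static double[] solve(double[][] a) {
        int n = a.length;
        for (int col = 0; col < n; col++) {
            int pivot = col;
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) {
                    pivot = r;
                }
            }
            double[] tmp = a[col];
            a[col] = a[pivot];
            a[pivot] = tmp;
            if (Math.abs(a[col][col]) < 1e-12) {
                throw new ArithmeticException("singular system: duplicate training vectors?");
            }
            for (int r = col + 1; r < n; r++) {
                double factor = a[r][col] / a[col][col];
                for (int c = col; c <= n; c++) {
                    a[r][c] -= factor * a[col][c];
                }
            }
        }
        double[] w = new double[n];
        for (int r = n - 1; r >= 0; r--) {
            double s = a[r][n];
            for (int c = r + 1; c < n; c++) {
                s -= a[r][c] * w[c];
            }
            w[r] = s / a[r][r];
        }
        return w;
    }
}
```

Because the hidden element centered on a training vector outputs exactly 1 for that vector, the solved weights make the network reproduce each training target. One way to realize the election described in the text would be to train one such network per class and pick the class whose output is closest to its target, but that arrangement is an assumption of this note.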

Application two (AP2): cicada density estimation by image processing
Visually, cicadas are quite noticeable in the farming environment, so a management hypothesis would be the inclusion of a camera to permit visual detection of pests, adjusting the data capture interval for sending to a web server.
The tests were performed using images captured in the coffee crop and divided into three classes that represent the pest infestation density: high, medium, and none. The submitted images were converted to grayscale and have a size of 320 × 240 pixels, as in Figure 5.
Similar to AP1, an SVM is used to classify the preprocessed input energies; however, in this case we use the N normalized energies of each wavelet sub-band, instead of 25 Bark bands, according to the decomposition level defined in each test instance. In Figure 6, we present a flowchart that illustrates the process of capturing and processing the images.
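As an illustration of sub-band energies for an image, the sketch below applies one level of the 2D Haar transform to a grayscale image and returns the four normalized sub-band energies. One decomposition level yields four sub-bands; deeper levels would repeat the step on the approximation sub-band. The class name, the choice of the Haar filter, and the order of the returned energies are assumptions of this sketch.

```java
/** Illustrative one-level 2D Haar sub-band energies (hypothetical helper). */
final class SubbandEnergies {

    /**
     * Returns normalized energies in the order
     * {approximation, horizontal difference, vertical difference, diagonal difference}.
     */
    static double[] levelOne(double[][] image) {
        int rows = image.length;
        int cols = image[0].length;
        if (rows % 2 != 0 || cols % 2 != 0) {
            throw new IllegalArgumentException("image dimensions must be even");
        }
        double[] e = new double[4];
        for (int r = 0; r < rows; r += 2) {
            for (int c = 0; c < cols; c += 2) {
                // 2x2 block:  a b
                //             p q
                double a = image[r][c];
                double b = image[r][c + 1];
                double p = image[r + 1][c];
                double q = image[r + 1][c + 1];
                double approx = (a + b + p + q) / 2.0;
                double horiz = (a - b + p - q) / 2.0; // difference between columns
                double vert = (a + b - p - q) / 2.0;  // difference between rows
                double diag = (a - b - p + q) / 2.0;
                e[0] += approx * approx;
                e[1] += horiz * horiz;
                e[2] += vert * vert;
                e[3] += diag * diag;
            }
        }
        double total = e[0] + e[1] + e[2] + e[3];
        if (total > 0.0) {
            for (int i = 0; i < 4; i++) {
                e[i] /= total;
            }
        }
        return e;
    }
}
```

A uniform image puts all of its energy in the approximation sub-band, whereas a fine checkerboard splits it between the approximation and diagonal sub-bands, so textured regions shift energy toward the detail sub-bands.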
Figure 7 illustrates the SVM with $n = 4$ inputs, corresponding to the number of energies of the selected level.

Figure 7.
The SVM structure used in application two. Similar to AP1, the weights determined during the supervised part of the training are $\{w_0, w_1, \ldots, w_{X-1}\}$. The output element linearly combines the outputs of the hidden layer with the weights.

Tests and results
The algorithms proposed here were implemented in the Java programming language.

Table 3.
Test results from AP2.

AP1
Thirty-five files from each class were collected and used. Two to five files were tried for training, and the tests were performed with the files not used for training. For each set of training files, the DWT maximum level and mean decomposition level were tested with each of the 46 wavelet filters presented in Table 2. The best result, 96.88% accuracy, was obtained with the Haar filter, supporting our hypothesis that this system is viable for use in coffee crops.
The cross-validation procedure was performed to present the best result in Table 2. The algorithm developed for cross-validation is illustrated in Figure 8.
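The algorithm of Figure 8 is not reproduced here. As a generic sketch only, the code below shows one way to draw a random split of the files of one class into a small training set and a test set, which could be repeated with different seeds. The class HoldoutSplit, its method names, and the seed parameter are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Illustrative random train/test split for one class (hypothetical helper). */
final class HoldoutSplit {

    /** Returns {training indices, testing indices} for files numbered 0..filesPerClass-1. */
    static int[][] split(int filesPerClass, int trainCount, long seed) {
        if (trainCount < 0 || trainCount > filesPerClass) {
            throw new IllegalArgumentException("invalid training set size");
        }
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < filesPerClass; i++) {
            order.add(i);
        }
        Collections.shuffle(order, new Random(seed)); // reproducible shuffle
        int[] train = new int[trainCount];
        int[] test = new int[filesPerClass - trainCount];
        for (int i = 0; i < filesPerClass; i++) {
            if (i < trainCount) {
                train[i] = order.get(i);
            } else {
                test[i - trainCount] = order.get(i);
            }
        }
        return new int[][] {train, test};
    }

    /** Accuracy in percent. */
    static double accuracy(int correct, int total) {
        return 100.0 * correct / total;
    }
}
```

With 35 files per class and, for example, three training files, 32 files per class remain for testing in each repetition.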

AP2
AP2 is an application that uses digital image processing to estimate cicada density in a coffee crop. To develop it, 35 high-density class files, 35 low-density class files, and 35 files considered by an expert as zero density were used. In Tables 3 and 4, we present the results of the tests performed in the laboratory, which demonstrate the viability of a future deployment of that system.

Conclusions
Both systems are being implemented in hardware for real-time deployment in coffee crops using ESP8266 devices and their derivatives, integrated with a cloud server that stores and organizes data to aid farmer decision-making. In future work, we intend to present practical results of these implementations.
Both AP1 and AP2 are systems that can be used as an additional form of coffee crop pest control and management. However, one possibility to be studied by the present research group is the combined use of both modalities in a single system in order to obtain even better results.
An important feature used in the laboratory experiments was Java [14] object serialization, which allowed the wavelet transform in both AP1 and AP2 to be performed only once: its result was stored on disk and recovered in the repeated cross-validation tests, substantially reducing processing time.
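A minimal sketch of this caching idea with standard Java object serialization is shown below. The class TransformCache, its method names, and the temporary-file round trip are hypothetical and serve only to illustrate storing a coefficient vector once and reading it back later.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustrative disk cache for transform results (hypothetical helper). */
final class TransformCache {

    /** Writes the coefficients to disk using Java object serialization. */
    static void save(double[] coeffs, Path file) {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(coeffs); // arrays of primitives are serializable
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads previously stored coefficients back from disk. */
    static double[] load(Path file) {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (double[]) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("unexpected cache content", e);
        }
    }

    /** Saves to a temporary file and loads it again, then removes the file. */
    static double[] roundTrip(double[] coeffs) {
        try {
            Path file = Files.createTempFile("wavelet-cache", ".ser");
            try {
                save(coeffs, file);
                return load(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a cross-validation loop, the transform of each file would be computed and saved on the first pass, and every later repetition would call load instead of recomputing it.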