Open access peer-reviewed chapter

Automatic Recognition of Tea Diseases Based on Deep Learning

Written By

Jing Chen and Junying Jia

Submitted: 25 November 2019 Reviewed: 02 March 2020 Published: 29 March 2020

DOI: 10.5772/intechopen.91953


With the rapid development of intelligent and precision agriculture, computer image processing technology has been widely used to solve problems in the agricultural field. In particular, convolutional neural networks (CNNs), with their advantages in image classification, have been widely applied to the automatic recognition and classification of plant diseases. In this paper, a deep convolutional neural network named LeafNet was established that recognizes seven types of disease from tea leaf disease images with an accuracy of 90.23%, with the aim of providing timely and accurate diagnostic services for remote, mountainous tea plantations in China. For comparison, a traditional machine learning pipeline was also applied: dense scale-invariant feature transform (DSIFT) descriptors were extracted from each image, and a bag of visual word (BOVW) model built on these descriptors was used to represent the image. Support vector machine (SVM) and multilayer perceptron (MLP) classifiers were then used to identify tea leaf diseases, with accuracies of 60.91% and 70.94%, respectively.


  • tea leaf disease
  • deep learning
  • convolutional neural network
  • dense SIFT
  • bag of visual word

1. Introduction

Tea has a long history of cultivation in China, and China’s tea planting area and yield rank first in the world. According to statistical data, in 2016 tea was grown in 17 Chinese provinces on a total of 2.87 million hectares, and the total output of tea reached 2.4 million tons [1]. The main tea-producing areas in China are mostly distributed in subtropical regions, where the natural environment varies with latitude and topography. The tea tree is a perennial evergreen woody plant that grows in a warm and humid environment, but such conditions also favor the development and spread of diseases. In recent years, the tea planting area has increased year by year and tea leaf diseases have risen continuously, seriously threatening the quality and yield of tea. Because most tea areas in China lie in high mountain regions where infrastructure lags behind, outbreaks of tea leaf diseases are often not controlled in a timely and effective manner, resulting in huge economic losses. Detecting and identifying diseases early in the field is therefore an important task for the sustainable development of the tea industry.

The diagnosis of plant diseases is usually based on visible symptoms. When the leaves of a plant are infected, their appearance changes significantly. Each disease usually produces discernible changes in leaf color and texture, and plant diseases can be diagnosed from these characteristics. However, farmers mainly rely on their own experience and senses to diagnose plant diseases, and limited background knowledge makes such diagnoses ambiguous. Most tea trees in China are planted across large mountainous areas where transportation and infrastructure are limited, so field surveys are difficult and inefficient. Relying on agricultural experts to diagnose tea leaf diseases is therefore both time-consuming and costly. Moreover, an expert needs experience and knowledge from several disciplines and must understand all the symptoms of each disease and the causes of their variation. Because China’s agricultural population is large and the number of experts engaged in agricultural services is extremely limited, it is necessary to establish a system that can diagnose tea leaf diseases in a timely and accurate manner.

The current diagnostic methods for plant diseases mainly include microscopic identification, molecular biology techniques, and spectroscopic techniques. The first method is time-consuming and subjective: even experienced plant pathologists may misjudge, leading to inaccurate conclusions. The latter two methods are currently considered more accurate, but their main disadvantages are high labor intensity and the need for specific instruments.

With the rapid development of intelligent and precision agriculture, machine learning methods and computer image processing technologies have been applied to the identification of plant diseases [2, 3], providing a new way of detecting plant diseases that can help farmers and researchers identify disease types quickly and accurately. The general approach is first to design and extract disease image features manually. These include global features, such as color features [4], shape features [5], texture features [6], or combinations of two or more of these [7, 8, 9, 10, 11], and local features, obtained with the scale-invariant feature transform (SIFT), speeded-up robust features (SURF), dense scale-invariant feature transform (dense SIFT), and pyramid histograms of visual words (PHOW) [12, 13, 14]. The extracted features are then classified using different classifiers, such as artificial neural networks [15, 16] and support vector machines [17, 18]. Because traditional machine learning relies on manually extracted features, the resulting recognition system is not fully automated.

At present, most research on tea using computer vision technology focuses on tea quality detection [19], tea species identification [20], and tea leaf disease information query and management based on expert systems [21]. An expert system has limited knowledge and needs to be updated and maintained regularly, which also restricts its use by technicians without a computing background. In some studies, the identification of tea diseases is based on hyperspectral [22] or infrared thermal images [1]. These methods are easy to operate and highly accurate, but the instruments are too costly for widespread adoption.

In recent years, the popularity of the Internet has led to explosive growth in Internet data, and the technical performance of computers and smartphones has continued to improve. These factors are the main reasons for the widespread attention that deep learning has received. Deep learning refers to the process of learning from sample data through a certain training method to obtain a deep network structure containing multiple levels [23]. Deep learning is a branch of machine learning. In essence it is still a neural network, but one with more than one hidden layer, which makes it an extension of artificial neural networks. The neural network is thus the basic component of deep learning.

The concept of deep learning was first mentioned by Professor Geoffrey Hinton of the University of Toronto in a paper on back-propagation algorithms, where “depth” was used to describe large artificial neural networks. With the introduction of deep learning, more and more researchers have begun to develop large-scale neural network systems. These deep neural network systems can extract features from the raw data, can work without human intervention, and can then use what has been learned to learn new things.

The advantage of deep learning is that it does not require manual feature extraction; the features are learned automatically by the network. It can solve problems that are not linearly separable and has strong generalization ability and robustness. Among deep learning models, the most widely used is the convolutional neural network, which is a deep neural network. Images can be used directly as input data, eliminating the complicated feature extraction and data reconstruction steps of traditional machine learning algorithms. At the same time, the multilayer structure of the convolutional neural network maintains a high degree of invariance to image translation, scaling, and lighting changes [18]. At present, convolutional neural networks have been applied to the identification and diagnosis of plant diseases [24, 25, 26].

In recent years, researchers around the world have used machine learning algorithms to build many disease recognition systems, but because the characteristics of each plant disease differ, different machine learning methods achieve different recognition results. Hence, building on previous studies, this paper uses a deep convolutional neural network to identify and classify tea leaf diseases. Traditional machine learning algorithms are compared with the proposed convolutional neural network, and a recognition system suitable for tea leaf disease is identified through comparative analysis.


2. Data acquisition

Existing online databases such as the ImageNet, PlantVillage, and CIFAR-10 datasets do not contain sufficient tea leaf disease images, and some studies have collected disease photos only in indoor or controlled environments. These factors limit recognition systems that are meant to identify diseases under natural light conditions, so a new disease data set was constructed in this paper.

Tea leaf disease images were all captured with a Canon PowerShot G12 camera under natural light in tea gardens in Chibi and Yichang, Hubei Province. The images were taken about 20 cm directly above the leaves in autofocus mode at a resolution of 4000 × 3000 pixels. A total of 3810 images covering 7 diseases were collected, and all of them were identified by plant pathologists using previously described identification schemes [27, 28]. To meet the input requirements of the algorithms and reduce computational complexity, all disease images were resized to 256 × 256 pixels and 750 × 750 pixels. Figure 1 shows the types of tea leaf diseases used in this experiment. Data augmentation was applied to the diseases with fewer images so that the seven classes are balanced; augmentation also improves the generalization ability of the classifier and benefits network training. Three kinds of transformation (flipping, rotation, and random cropping) were used to alter the input images (Figure 2). A total of 7905 tea leaf disease images were obtained after augmentation (Table 1). The images were split 80/20 into training and test data, the ratio most commonly used in neural network applications; in addition, a subset of the test data amounting to about 10% of all images was used for validation [29].

Figure 1.

Typical example images of tea leaf diseases used in this manuscript. (1) Red leaf spot (Phyllosticta theicola Petch). (2) Algal leaf spot (Cephaleuros virescens Kunze). (3) Bird’s-eye spot (Cercospora theae Breda de Haan). (4) Gray blight (Pestalotiopsis theae Steyaert). (5) White spot (Phyllosticta theaefolia Hara). (6) Anthracnose (Gloeosporium theae-sinensis Miyake). (7) Brown blight (Colletotrichum camelliae Massee).

Figure 2.

Examples of data augmentation used for red leaf spot images. (a) Initial; (b) flip horizontal; (c) flip vertical; (d) rotated 180°; (e–g) randomly cropped; (h) right-rotate 90°; (i) left-rotate 90°.
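The geometric transformations listed in Figure 2 can be sketched in a few lines of Python. This is only a minimal illustration that assumes an image is stored as a nested list of pixel rows; the function names and the crop size are hypothetical and are not taken from the authors' pipeline.

```python
import random

def flip_horizontal(img):
    """Mirror each row left to right."""
    return [row[::-1] for row in img]

def flip_vertical(img):
    """Reverse the order of the rows (top to bottom)."""
    return img[::-1]

def rotate_180(img):
    """A 180-degree rotation is a horizontal flip followed by a vertical flip."""
    return flip_vertical(flip_horizontal(img))

def rotate_right_90(img):
    """Rotate clockwise: each column, read bottom-up, becomes a row."""
    return [list(col) for col in zip(*img[::-1])]

def rotate_left_90(img):
    """Rotate counter-clockwise."""
    return [list(col) for col in zip(*img)][::-1]

def random_crop(img, size, rng):
    """Cut a size x size window at a random position (size is an assumed parameter)."""
    top = rng.randrange(len(img) - size + 1)
    left = rng.randrange(len(img[0]) - size + 1)
    return [row[left:left + size] for row in img[top:top + size]]

# toy 3 x 3 "image" so the effect of each transform is easy to read
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
augmented = [
    flip_horizontal(image), flip_vertical(image), rotate_180(image),
    rotate_right_90(image), rotate_left_90(image),
    random_crop(image, 2, random.Random(0)),
]
```

Each original image therefore yields several additional training samples, which is how a class with few photographs can be brought up to roughly the same size as the others.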

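| Class | Training | Validation | Testing |
| --- | --- | --- | --- |
| (1) White spot | 941 | 118 | 117 |
| (2) Bird’s-eye spot | 955 | 120 | 119 |
| (3) Red leaf spot | 890 | 111 | 111 |
| (4) Gray blight | 893 | 112 | 111 |
| (5) Anthracnose | 880 | 110 | 110 |
| (6) Brown blight | 920 | 115 | 115 |
| (7) Algal leaf spot | 846 | 106 | 105 |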

Table 1.

Tea leaf disease dataset in this manuscript.
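As a quick arithmetic check of Table 1, the per-class counts can be summed to confirm the total of 7905 images and the resulting split. The sketch below assumes that the three numbers given for each class are the training, validation, and testing counts, in that order.

```python
# (training, validation, testing) image counts per class, copied from Table 1;
# the column order is an assumption
table1 = {
    "White spot": (941, 118, 117),
    "Bird's-eye spot": (955, 120, 119),
    "Red leaf spot": (890, 111, 111),
    "Gray blight": (893, 112, 111),
    "Anthracnose": (880, 110, 110),
    "Brown blight": (920, 115, 115),
    "Algal leaf spot": (846, 106, 105),
}

train = sum(counts[0] for counts in table1.values())
val = sum(counts[1] for counts in table1.values())
test = sum(counts[2] for counts in table1.values())
total = train + val + test

# share of each subset in the whole dataset, rounded to three decimals
fractions = tuple(round(n / total, 3) for n in (train, val, test))
```

Under this reading the totals are 6325, 792, and 788 images, which add up to 7905 and correspond to a split of roughly 80%, 10%, and 10%.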


3. Tea leaf disease identification based on BOVW model

A traditional machine learning algorithm is a shallow architecture that contains one or two nonlinear transformation layers. It can automatically learn the underlying regularities in the data and use the learned rules to make predictions. In computer vision, many such models work by manually designing and extracting visual features of the image in advance, which converts the image content into a quantitative description that is then processed by the shallow model.

3.1 Image visual feature

The extraction and selection of image visual features is an important means of transforming image content into a quantitative description. Visual features mainly include global features and local features. Global features describe the overall attributes of the entire image, mainly color, texture, and shape, and can be observed directly by eye. They are pixel-level shallow features with good stability and real-time performance, and the algorithms are simple to implement. Their shortcomings are high feature dimensionality, a large amount of computation, and sensitivity to changes in image scale, lighting, and perspective. Local features are extracted from local areas of the image, including corners, lines, edges, and areas with special attributes. Local features are distinctive and robust to changes in lighting, rotation, perspective, and scale, and they have low dimensionality and are easy to implement.

The scale-invariant feature transform (SIFT) is a local feature descriptor proposed by David G. Lowe in 1999 [30]. The SIFT descriptor is invariant to image rotation, translation, and scaling, and it remains stable under affine transformation, changes in perspective and brightness, and noise. It can also be combined with other algorithms to form optimized variants that increase the operation speed.

The traditional SIFT descriptor mainly extracts stable feature points in the image, which leads to the loss of some image information and a long calculation time. In addition, the number of feature points extracted differs from image to image, so the resulting representations have different dimensions. Lazebnik et al. changed the number and distribution of SIFT descriptors to obtain dense SIFT [31]. The main difference between the dense SIFT descriptor and the traditional SIFT descriptor is the sampling method. The SIFT descriptor constructs a scale space to detect and filter feature points, whereas the dense SIFT algorithm slides a fixed-size rectangular window over the image from left to right and from top to bottom with a specified step size. The center of the window is used as a key point, and the 16 × 16 pixel block around the center is divided into 4 × 4 cells of 4 × 4 pixels each. Within each cell, the SIFT algorithm is used to calculate a gradient histogram over 8 directions, giving a 4 × 4 × 8 = 128-dimensional feature vector that forms the DSIFT descriptor. The feature points extracted by this method are uniformly distributed and of identical size, and they remain stable under changes in illumination and perspective, affine transformation, scaling, and rotation.
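The dense sampling and the 4 × 4 × 8 = 128-dimensional layout can be illustrated with a simplified sketch. It is not a full SIFT implementation: Gaussian weighting, interpolation between bins, and clipping are omitted, windows are indexed by their top-left corner for convenience, and the step size of 8 pixels is an assumed value.

```python
import math

def dense_grid(height, width, patch=16, step=8):
    """Top-left corners of patch x patch windows, scanned left to right, top to bottom."""
    return [(r, c)
            for r in range(0, height - patch + 1, step)
            for c in range(0, width - patch + 1, step)]

def describe_patch(img, top, left, patch=16, cells=4, bins=8):
    """Simplified SIFT-style descriptor: one 8-bin orientation histogram per cell."""
    cell = patch // cells
    hist = [0.0] * (cells * cells * bins)
    last_row, last_col = len(img) - 1, len(img[0]) - 1
    for r in range(top, top + patch):
        for c in range(left, left + patch):
            # central differences, clamped at the image border
            dx = img[r][min(c + 1, last_col)] - img[r][max(c - 1, 0)]
            dy = img[min(r + 1, last_row)][c] - img[max(r - 1, 0)][c]
            magnitude = math.hypot(dx, dy)
            angle = math.atan2(dy, dx) % (2 * math.pi)
            b = int(angle / (2 * math.pi) * bins) % bins
            cy, cx = (r - top) // cell, (c - left) // cell
            hist[(cy * cells + cx) * bins + b] += magnitude
    norm = math.sqrt(sum(v * v for v in hist)) or 1.0
    return [v / norm for v in hist]  # L2-normalized 128-dimensional vector

# toy 32 x 32 grayscale image: a horizontal intensity ramp
img = [[float(c) for c in range(32)] for _ in range(32)]
descriptors = [describe_patch(img, r, c) for r, c in dense_grid(32, 32)]
```

Because every window has the same size and the grid is regular, each image of a given size yields the same number of descriptors, which is the property the text contrasts with keypoint-based SIFT.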

3.2 Bag of visual word-based feature representation

The bag of visual word (BOVW) model derives from the bag-of-words model, which was originally applied to text classification and retrieval. The core idea of the bag-of-words model is to treat a text as a collection of words, ignoring word order, grammar, and syntax; the words are regarded as discrete and independent of one another. The frequency of each word in the text is counted and represented as a histogram, so that each text is represented as a vector.

Following the successful application of this model in text retrieval, Csurka et al. introduced it to the field of computer vision [32]. An image is treated as a document, and the features of the image (usually local features) as the words that make up the image. Unlike the words in a text, there are no ready-made words in an image, so independent features must be extracted from the image; these are called visual words, and similar features are regarded as the same visual word. In this way, the image can be described as an unordered set of visual words (local features). Although local features (such as SIFT) can also describe an image directly, each SIFT descriptor is a 128-dimensional vector and an image contains hundreds or thousands of them, so the amount of calculation is very large. These vectors are therefore clustered, and each cluster center is used to represent one visual word.

The image classification using BOVW model mainly includes the following steps:

  1. Image feature extraction and description: Local feature vectors of the entire training set image are obtained through methods such as point-of-interest detection, dense sampling, or random sampling. Commonly used local features include SIFT descriptor and SURF descriptor.

  2. Construct a visual vocabulary: After obtaining the local feature vectors of all sample images, use the k-means algorithm to cluster them. The k-means algorithm is an unsupervised learning algorithm. It iteratively assigns each data point to the nearest cluster center, measured by Euclidean distance, and then recomputes each center as the mean of the points assigned to it [33]. The smaller the distance, the higher the similarity. Here k is the number of clusters, and “means” refers to the mean of the data in each cluster. If there are k cluster centers (i.e., visual words), then the size of the visual vocabulary is also k. This manuscript selects 1000 visual words, so the size of the visual vocabulary is 1000.

  3. Representing images by word frequency: using the vocabulary as a standard, count the number of occurrences of each visual word in the image, and each image becomes a word frequency vector corresponding to the visual word sequence in the vocabulary, that is, each image is represented by a 1000-dimensional numerical vector.

  4. Classification: the 1000-dimensional numerical vector generated in the previous step is used as the input of a classifier.
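Steps 2 and 3 above can be sketched as follows. This is a toy illustration rather than the authors' implementation: the "descriptors" are 2-dimensional instead of 128-dimensional, the vocabulary has 2 words instead of 1000, and the evenly spaced initialization is an illustrative choice (k-means is usually initialized randomly or with k-means++).

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, centers):
    """Index of the closest cluster center."""
    return min(range(len(centers)), key=lambda i: sq_dist(vec, centers[i]))

def kmeans(vectors, k, iters=20):
    """Plain Lloyd iterations; returns k cluster centers (the visual words)."""
    step = len(vectors) // k
    centers = [list(v) for v in vectors[::step][:k]]  # deterministic toy initialization
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(v, centers)].append(v)
        centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers

def bovw_histogram(descriptors, vocabulary):
    """Count how often each visual word occurs among one image's descriptors."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        hist[nearest(d, vocabulary)] += 1
    return hist

# toy local descriptors from two hypothetical images
image_a = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1]]
image_b = [[5.1, 5.0], [4.9, 5.0], [0.0, 0.0]]
vocabulary = kmeans(image_a + image_b, k=2)
hist_a = bovw_histogram(image_a, vocabulary)
hist_b = bovw_histogram(image_b, vocabulary)
```

The two histograms have the same length regardless of how many descriptors each image contributed, which is what allows them to be fed to a fixed-size classifier in step 4.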

3.3 Classifiers

3.3.1 Support vector machines

Support vector machines (SVMs) were proposed by Cortes and Vapnik in 1995 [34]. The SVM is a learning method based on Vapnik–Chervonenkis (VC) statistical learning theory and the structural risk minimization criterion. It has advantages in solving small-sample, nonlinear, and high-dimensional pattern recognition problems. The basic idea of the SVM is to map vectors from a low-dimensional space to a high-dimensional space through a nonlinear transformation defined by an inner product (kernel) function. In this high-dimensional space, the optimal separating hyperplane is the one that maximizes the geometric distance between the support vectors and the hyperplane. SVMs were initially developed for two-class problems in the linearly separable case. They require relatively small sample sizes and an appropriate training rule, which has led to their widespread use in image classification and recognition.

As research on support vector machines has deepened, many scholars have developed toolkits that make them suitable for specific fields. This manuscript uses LIBLINEAR, a linear classifier developed by Professor Chih-Jen Lin of National Taiwan University mainly for processing large-scale data and features [35]. LIBLINEAR can be used in the following three cases: when the number of features is much larger than the number of samples; when both the number of features and the number of samples are large; and when the number of features is much smaller than the number of samples. Because a linear classifier is less complex than a nonlinear one, training time is greatly reduced, and with a large amount of data the performance of linear and nonlinear classifiers is comparable.
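To make the idea of a linear SVM concrete, the sketch below trains a two-class linear classifier by full-batch sub-gradient descent on the L2-regularized hinge loss. It is not LIBLINEAR's solver and does not use its API; the hyperparameters (`lam`, `lr`, `epochs`) and the toy data are assumptions chosen for illustration. A seven-class problem would typically combine several such binary classifiers, for example in a one-vs-rest scheme.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimize lam/2 * ||w||^2 + mean hinge loss; labels y must be +1 or -1."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]  # gradient of the regularization term
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only samples inside the margin contribute
                for j, xj in enumerate(xi):
                    grad_w[j] -= yi * xj / n
                grad_b -= yi / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    """Sign of the decision function w.x + b."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# linearly separable toy data in two dimensions
X = [[2.0, 2.0], [3.0, 3.0], [2.5, 3.5], [-2.0, -2.0], [-3.0, -1.0], [-2.5, -3.0]]
y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(X, y)
```

For the BOVW pipeline described here, each input `x` would be a 1000-dimensional word frequency vector rather than a 2-dimensional point.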

3.3.2 Multi-layer perceptron

The perceptron was proposed by Rosenblatt in 1958 [36]. It is an artificial neural network structure and the earliest feed-forward neural network. A single-layer perceptron contains only two layers, namely, the input layer and the output layer. Because of its limited mapping capability, it can only solve linearly separable classification problems. A multi-layer perceptron has one or more hidden layers between the input layer and the output layer and is mainly used for nonlinear classification and regression. Like other multilayer neural networks, it is trained with the back-propagation algorithm.

The multi-layer perceptron in this manuscript uses a three-layer structure. Because the extracted features are 1000-dimensional vectors, the input layer contains 1000 nodes; the hidden layer contains 100 nodes, and the output layer contains 7 nodes, one for each type of tea leaf disease.
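The 1000-100-7 structure can be written down directly. The following sketch counts its parameters and runs one forward pass; the ReLU hidden activation, the softmax output, and the random initialization are assumptions, since the text does not specify them for the MLP, and the input vector is a random stand-in for a BOVW histogram.

```python
import math
import random

LAYER_SIZES = [1000, 100, 7]  # input, hidden, and output nodes as stated in the text

def count_parameters(sizes):
    """Weights plus biases of a fully connected network."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

def init_layers(sizes, rng):
    """Small random weights and zero biases (an assumed initialization)."""
    return [([[rng.gauss(0.0, 0.01) for _ in range(n_in)] for _ in range(n_out)],
             [0.0] * n_out)
            for n_in, n_out in zip(sizes, sizes[1:])]

def forward(layers, x):
    """Hidden layers use ReLU (assumed); the output layer uses softmax (assumed)."""
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in z]
        else:
            peak = max(z)
            exps = [math.exp(v - peak) for v in z]
            total = sum(exps)
            x = [e / total for e in exps]
    return x

rng = random.Random(0)
layers = init_layers(LAYER_SIZES, rng)
bovw_vector = [rng.random() for _ in range(1000)]  # stand-in for a word frequency vector
probs = forward(layers, bovw_vector)
n_params = count_parameters(LAYER_SIZES)
```

With these layer sizes the network has 1000 × 100 + 100 + 100 × 7 + 7 = 100,807 trainable parameters.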


4. Deep learning network construction

The network architecture designed in this manuscript, named LeafNet, is an improved version of the classic AlexNet model. The total number of parameters (weights and biases) of the classic AlexNet network is more than 60 million, of which the convolutional layers account for 3.8% and the fully connected layers for 96.2%. LeafNet therefore uses fewer convolutional filters and fewer fully connected nodes, which reduces both the total number of network parameters and the computational complexity. The resulting recognition model has a relatively simple structure and a small amount of calculation, which helps to reduce overfitting.

4.1 Network structure

LeafNet consists of five convolutional layers, two fully connected layers, and a classification layer. The number of filters in the first, second, and fifth convolutional layers is half of that used in AlexNet. In addition, the numbers of neurons in the fully connected layers are set to 500, 100, and 7, respectively. The entire network structure is shown in Table 2.

| Layer | Parameters | Activation function |
| --- | --- | --- |
| Input | 227 × 227 × 3 | |
| Convolution 1 (Conv1) | 24 convolution filters (11 × 11), stride 4 | ReLU |
| Pooling 1 (Pool1) | Max pooling (3 × 3), stride 2 | |
| Convolution 2 (Conv2) | 64 convolution filters (5 × 5), stride 1 | ReLU |
| Pooling 2 (Pool2) | Max pooling (3 × 3), stride 2 | |
| Convolution 3 (Conv3) | 96 convolution filters (3 × 3), stride 1 | ReLU |
| Convolution 4 (Conv4) | 96 convolution filters (3 × 3), stride 1 | ReLU |
| Convolution 5 (Conv5) | 64 convolution filters (3 × 3), stride 1 | ReLU |
| Pooling 5 (Pool5) | Max pooling (3 × 3), stride 2 | |
| Full Connect 6 (fc6) | 500 nodes, stride 1 | ReLU |
| Full Connect 7 (fc7) | 100 nodes, stride 1 | ReLU |
| Full Connect 8 (fc8) | 7 nodes, stride 1 | ReLU |
| Output | 1 node | Softmax |

Table 2.

Layer parameters for the LeafNet.
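The layer list in Table 2 is enough to trace how the spatial size of the feature maps shrinks from the 227 × 227 input, and to estimate the parameter count. The sketch below uses the standard output-size formula; the padding values are not given in the table and are assumed to follow AlexNet (0 for Conv1 and the pooling layers, 2 for Conv2, 1 for Conv3 to Conv5), and ungrouped convolutions are assumed, so the resulting numbers are estimates rather than figures reported by the authors.

```python
def out_size(size, kernel, stride, pad=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1

# (name, kind, filters, kernel, stride, pad); the pad values are assumptions
LAYERS = [
    ("conv1", "conv", 24, 11, 4, 0),
    ("pool1", "pool", None, 3, 2, 0),
    ("conv2", "conv", 64, 5, 1, 2),
    ("pool2", "pool", None, 3, 2, 0),
    ("conv3", "conv", 96, 3, 1, 1),
    ("conv4", "conv", 96, 3, 1, 1),
    ("conv5", "conv", 64, 3, 1, 1),
    ("pool5", "pool", None, 3, 2, 0),
]
FC_NODES = [500, 100, 7]

def trace(input_size=227, channels=3):
    """Return {layer: (spatial size, channels)} and the total parameter count."""
    shapes, params, size = {}, 0, input_size
    for name, kind, filters, kernel, stride, pad in LAYERS:
        size = out_size(size, kernel, stride, pad)
        if kind == "conv":
            params += kernel * kernel * channels * filters + filters  # weights + biases
            channels = filters
        shapes[name] = (size, channels)
    features = size * size * channels  # flattened input of the first FC layer
    for nodes in FC_NODES:
        params += features * nodes + nodes
        features = nodes
    return shapes, params

shapes, total_params = trace()
```

Under these assumptions the feature maps go from 55 × 55 after Conv1 to 6 × 6 after Pool5, and the network has roughly 1.44 million parameters, compared with the more than 60 million cited for AlexNet.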

In this experiment, except for the last layer, the rectified linear unit (ReLU) activation function is used instead of the traditional sigmoid and tanh functions. The main disadvantages of the sigmoid and tanh functions are that they are expensive to compute and that they saturate: when the input is very large or very small, the output flattens and the gradient becomes very small, which hinders weight updates and can ultimately prevent the network from completing training. ReLU is more in line with the principle of neuron signal excitation. It sets the output of some neurons to 0, which makes the network sparse and reduces the interdependence of parameters, effectively alleviating overfitting. At the same time, ReLU propagates errors better and mitigates the vanishing gradient problem, so the network converges faster during training.
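The saturation argument can be checked numerically. The sketch below compares the derivative of the sigmoid with that of ReLU at a few hand-picked input values.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of the sigmoid: s(x) * (1 - s(x)), at most 0.25."""
    s = sigmoid(x)
    return s * (1.0 - s)

def relu(x):
    return max(0.0, x)

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

inputs = [-10.0, -1.0, 0.5, 10.0]
sigmoid_grads = [sigmoid_grad(x) for x in inputs]
relu_grads = [relu_grad(x) for x in inputs]
```

For inputs of large magnitude the sigmoid gradient is close to zero, whereas the ReLU gradient stays at 1 for every positive input, so the error signal is not attenuated as it passes backward through active units.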

A local response normalization operation is applied to the nonlinear output of the first two convolutional layers. This normalization mimics the lateral inhibition phenomenon of neurobiology. It creates a competition mechanism among the outputs of neighboring neurons, making large responses relatively larger and suppressing smaller ones, thereby enhancing the generalization ability of the model.
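A minimal sketch of local response normalization across channels is given below. It follows the form used in the AlexNet paper, where each activation is divided by a power of the summed squared activations of its neighboring channels; the constants n = 5, k = 2, alpha = 1e-4, and beta = 0.75 are AlexNet's published values and are assumed here, since this chapter does not state the ones used for LeafNet.

```python
def local_response_norm(acts, n=5, k=2.0, alpha=1e-4, beta=0.75):
    """Normalize each channel by the squared activity of its n neighboring channels.

    `acts` holds the activations of all channels at one spatial position.
    """
    half = n // 2
    out = []
    for i, a in enumerate(acts):
        lo, hi = max(0, i - half), min(len(acts) - 1, i + half)
        energy = sum(v * v for v in acts[lo:hi + 1])
        out.append(a / (k + alpha * energy) ** beta)
    return out

# channel 1 responds strongly; channels 0 and 5 have the same raw activation
channels = [1.0, 50.0, 2.0, 1.0, 0.5, 1.0]
normalized = local_response_norm(channels)
```

In this toy example channel 0 lies next to the strongly responding channel 1 and is damped more than channel 5, which has the same raw activation but quieter neighbors. This is the competition effect described above.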

The first two fully connected layers have introduced the dropout operation. The dropout technique is an effective solution to overfitting via the training of only some of the randomly selected nodes rather than the entire network [37]. In this article, the dropout ratio is set to 0.5.
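The dropout operation with a ratio of 0.5 can be sketched as follows. This uses the common "inverted dropout" variant, in which surviving activations are rescaled during training so that nothing needs to change at test time; whether the toolbox used by the authors scales at training or at test time is not stated, so this detail is an assumption.

```python
import random

def dropout(activations, rate=0.5, training=True, rng=random):
    """Zero each unit with probability `rate` during training and rescale the
    survivors so that the expected activation is unchanged."""
    if not training:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)
hidden = [1.0] * 1000  # hypothetical activations of a fully connected layer
dropped = dropout(hidden, rate=0.5, rng=rng)
kept_fraction = sum(1 for v in dropped if v != 0.0) / len(dropped)
```

On each training step a different random half of the nodes is silenced, so no single node can rely on the presence of particular other nodes.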

Softmax is the activation function of the last fully connected layer and is mainly used in the output layer of multi-class classification problems. It makes all output values sum to 1; that is, the raw outputs for the classes are converted into relative probabilities, and the class with the highest probability is the predicted class.
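The softmax computation for the seven disease classes can be illustrated with a short sketch. The raw scores below are hypothetical values standing in for the seven outputs of the last fully connected layer.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

CLASSES = ["White spot", "Bird's-eye spot", "Red leaf spot", "Gray blight",
           "Anthracnose", "Brown blight", "Algal leaf spot"]

scores = [1.2, 3.1, 0.3, 0.8, -0.5, 0.1, 2.0]  # hypothetical network outputs
probs = softmax(scores)
predicted = CLASSES[probs.index(max(probs))]
```

The class with the largest raw score also receives the largest probability, so softmax does not change the prediction; it only makes the outputs interpretable as relative probabilities.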

4.2 Training network

LeafNet is trained using stochastic gradient descent (SGD). The weights of all convolutional and fully connected layers are initialized from a Gaussian distribution, and the biases are initialized to a constant of 1. This setting helps to keep the input of the ReLU activation function positive at the start of training and speeds up training [25]. Because the number of samples is small, the batch size is set to 16. Batch training improves the convergence speed of the network and keeps memory usage low. The initial learning rate of all layers is set to 0.1. The learning rate is reduced as the error declines: each time, it is multiplied by 0.1 for subsequent iterations, with a minimum learning rate of 0.0001. The number of epochs was set to 100, the weight decay to 0.0005, and the momentum to 0.9 [38]. LeafNet is implemented using the MatConvNet toolbox for Matlab. Training is performed on a Windows system with a Core i7-3770K CPU, 8 GB of RAM, and two NVIDIA GeForce GTX 980 GPUs for acceleration.
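The training hyperparameters above map onto a standard SGD update with momentum and weight decay, sketched below. The exact update formula used by the toolbox and the precise rule for deciding when the error has stopped declining are not given in the text, so both the update form and the `error_improved` flag are assumptions made for illustration.

```python
def sgd_momentum_step(w, grad, velocity, lr, momentum=0.9, weight_decay=0.0005):
    """One update: v <- momentum * v - lr * (grad + weight_decay * w); w <- w + v."""
    new_v = [momentum * v - lr * (g + weight_decay * wi)
             for wi, g, v in zip(w, grad, velocity)]
    new_w = [wi + v for wi, v in zip(w, new_v)]
    return new_w, new_v

def next_learning_rate(lr, error_improved, factor=0.1, floor=0.0001):
    """Keep lr while the error still falls; otherwise multiply by `factor`,
    never going below `floor`."""
    return lr if error_improved else max(lr * factor, floor)

# one update of a single hypothetical weight
weights, velocity = sgd_momentum_step([1.0], [0.5], [0.0], lr=0.1)

# learning-rate trace when the error stalls four times in a row
lr, schedule = 0.1, [0.1]
for _ in range(4):
    lr = next_learning_rate(lr, error_improved=False)
    schedule.append(lr)
```

Starting from 0.1, three reductions by a factor of 0.1 reach the floor of 0.0001, after which the learning rate stays constant.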


5. Performance measurements

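Following [39], the classification accuracy and mean class accuracy (MCA) are used to evaluate the performance of the algorithms. The correct classification rate for class k, CCR_k, is first defined as shown in Eq. (1):

$$\mathrm{CCR}_k = \frac{C_k}{N_k} \tag{1}$$

where $C_k$ is the number of correctly identified samples of class k and $N_k$ is the total number of samples in class k. Classification accuracy is then defined by Eq. (2):

$$\mathrm{Accuracy} = \frac{\sum_{k} \mathrm{CCR}_k \, N_k}{\sum_{k} N_k} = \frac{\sum_{k} C_k}{\sum_{k} N_k} \tag{2}$$

Lastly, MCA is determined using Eq. (3):

$$\mathrm{MCA} = \frac{1}{K} \sum_{k=1}^{K} \mathrm{CCR}_k \tag{3}$$

where K is the number of classes.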

6. Results and analysis

In this study, the accuracy of the SVM, MLP, and CNN classifiers in determining disease states for tea leaves from images was evaluated. The results of these analyses are shown in Figure 3. Error matrices were used to evaluate the accuracy of the tea leaf disease recognition classifiers (Tables 3–5). These data show that, although LeafNet is significantly better than the SVM and MLP algorithms, all three recognition algorithms correctly identify most tea leaf diseases. Traditional machine learning algorithms extract a limited number of surface features from the images, and their ability to represent image content is weak, resulting in a low accuracy rate for identifying diseases. In contrast, the CNN automatically extracts deep features that express the characteristics of the disease image more accurately, so its recognition accuracy is higher.

Figure 3.

Accuracy (%) of disease classification for each of the three classification models in recognizing the seven candidate tea diseases.

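| True class | White spot | Bird’s-eye spot | Red leaf spot | Gray blight | Anthracnose | Brown blight | Algal leaf spot | Sensitivity | Accuracy | MCA |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| White spot | 111 | 3 | 0 | 0 | 3 | 0 | 0 | 94.87% | 90.23% | 90.16% |
| Bird’s-eye spot | 1 | 117 | 0 | 0 | 0 | 0 | 1 | 98.32% | | |
| Red leaf spot | 0 | 0 | 95 | 7 | 0 | 8 | 1 | 85.59% | | |
| Gray blight | 0 | 0 | 4 | 96 | 3 | 7 | 1 | 86.49% | | |
| Brown blight | 0 | 1 | 15 | 2 | 0 | 97 | 0 | 84.35% | | |
| Algal leaf spot | 1 | 1 | 2 | 2 | 1 | 0 | 98 | 93.33% | | |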

Table 3.

Error matrix showing the classification accuracy of the LeafNet algorithm.
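The performance measures of Section 5 can be computed directly from an error matrix. The sketch below encodes the six class rows listed in Table 3 (rows are true classes, columns are predicted classes) and recomputes the per-class correct classification rate, which corresponds to the sensitivity column. The function names are illustrative; the accuracy and MCA helpers operate on whichever set of class rows is passed to them.

```python
# column order of Table 3; rows are true classes, columns are predicted classes
COLUMNS = ["White spot", "Bird's-eye spot", "Red leaf spot", "Gray blight",
           "Anthracnose", "Brown blight", "Algal leaf spot"]
rows = {
    "White spot":      [111, 3, 0, 0, 3, 0, 0],
    "Bird's-eye spot": [1, 117, 0, 0, 0, 0, 1],
    "Red leaf spot":   [0, 0, 95, 7, 0, 8, 1],
    "Gray blight":     [0, 0, 4, 96, 3, 7, 1],
    "Brown blight":    [0, 1, 15, 2, 0, 97, 0],
    "Algal leaf spot": [1, 1, 2, 2, 1, 0, 98],
}

def ccr(name):
    """Correct classification rate C_k / N_k for one class."""
    counts = rows[name]
    return counts[COLUMNS.index(name)] / sum(counts)

def accuracy(names):
    """Sum of correct predictions divided by the number of samples."""
    correct = sum(rows[n][COLUMNS.index(n)] for n in names)
    return correct / sum(sum(rows[n]) for n in names)

def mean_class_accuracy(names):
    """Unweighted mean of the per-class correct classification rates."""
    return sum(ccr(n) for n in names) / len(names)

sensitivities = {name: round(100 * ccr(name), 2) for name in rows}
```

Because the classes have nearly equal test set sizes, the overall accuracy and the MCA reported in the table differ only slightly.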

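| True class | White spot | Bird’s-eye spot | Red leaf spot | Gray blight | Anthracnose | Brown blight | Algal leaf spot | Sensitivity | Accuracy | MCA |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| White spot | 79 | 11 | 0 | 2 | 1 | 9 | 15 | 67.52% | 60.91% | 60.62% |
| Bird’s-eye spot | 12 | 89 | 0 | 4 | 1 | 10 | 3 | 74.79% | | |
| Red leaf spot | 2 | 4 | 59 | 23 | 2 | 19 | 2 | 53.15% | | |
| Gray blight | 0 | 0 | 13 | 70 | 8 | 17 | 3 | 63.06% | | |
| Brown blight | 0 | 2 | 19 | 17 | 3 | 73 | 1 | 63.48% | | |
| Algal leaf spot | 9 | 10 | 12 | 13 | 3 | 4 | 54 | 51.43% | | |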

Table 4.

Error matrix showing the classification accuracy of the SVM algorithm.

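| True class | White spot | Bird’s-eye spot | Red leaf spot | Gray blight | Anthracnose | Brown blight | Algal leaf spot | Sensitivity | Accuracy | MCA |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| White spot | 83 | 13 | 0 | 3 | 1 | 5 | 12 | 70.94% | 70.94% | 70.77% |
| Bird’s-eye spot | 6 | 100 | 0 | 6 | 1 | 5 | 1 | 84.03% | | |
| Red leaf spot | 1 | 1 | 80 | 17 | 0 | 11 | 1 | 72.07% | | |
| Gray blight | 0 | 0 | 9 | 81 | 6 | 14 | 1 | 72.97% | | |
| Brown blight | 0 | 5 | 16 | 15 | 3 | 75 | 1 | 65.22% | | |
| Algal leaf spot | 6 | 5 | 9 | 10 | 4 | 4 | 67 | 63.81% | | |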

Table 5.

Error matrix showing the classification accuracy of the MLP algorithm.

The error matrices show that the recognition accuracies of MLP and SVM for the seven tea leaf diseases are 70.94% and 60.91%, respectively, and the MCAs are 70.77% and 60.62%, respectively. With both algorithms, bird’s-eye spot is recognized most accurately, whereas no obvious pattern emerges for the remaining diseases. Bird’s-eye spot is clearly distinguishable: it is characterized by small, dense red-brown dots that differ markedly from the symptoms of the other diseases, so its identification accuracy is high.

The recognition accuracy of the SVM and MLP algorithms for tea leaf disease is not high, which is a consequence of manual feature selection. The performance of both algorithms depends largely on whether the manually selected features are reasonable, and researchers usually rely on personal experience when selecting features. Although good results can be obtained with hand-crafted features, these features are specific to certain datasets. If the same features are used to analyze different data sets, the results may be very different, which is a problem inherent in these techniques.

LeafNet recognizes bird’s-eye spot best, which may be due to the obvious pathological symptoms of this disease and the strong recognition ability of the LeafNet algorithm. White spot ranks second, while the accuracies for the other diseases range from 84 to 93%. Because gray blight, red leaf spot, and brown blight have similar pathological characteristics, the classification accuracy for these three diseases is the lowest. The symptoms of gray blight and brown blight are particularly similar: both exhibit annulations in their late stage, which makes them difficult to distinguish. In addition, the symptoms of white spot and bird’s-eye spot both include reddish-brown spots at early stages, and both anthracnose and brown blight are typified by waterlogged leaves during early disease stages, although their symptoms differ in the later stages. Some diseases can occur in tea plants throughout the year, whereas others occur at distinct times, so the time at which a disease is photographed may affect identification accuracy. Another factor may be that a tea leaf can be infected with two or more diseases at the same time: once a leaf is infected by one pathogen, it is physiologically weakened and a second pathogen can infect it easily. These factors are the main reasons for the lower accuracy of the model on some diseases.

In addition, the performance of LeafNet was compared with the method of Reference [40], which covers 10 diseases of 3 crops and reaches a maximum accuracy of 97.3%. The accuracy of LeafNet is thus lower than that of Reference [40], which used the currently popular transfer learning approach. The main advantages of transfer learning are that the network converges quickly even when the data set is small, that it is easy to implement, and that training time is shorter. Therefore, in future work we will continue to study and apply transfer learning algorithms to identify more plant diseases.


7. Conclusion

CNNs have developed into mature techniques that are increasingly applied in image recognition. Compared with other algorithms, they considerably reduce the computational complexity of neural network analyses while also significantly improving precision. At the same time, the high fault tolerance of CNNs allows the use of incomplete images or images with fuzzy backgrounds, thereby effectively enhancing the precision of image recognition.

Feature extraction is an important step in image classification and directly affects classification accuracy. Thus, two feature extraction methods and three classifiers were compared in their ability to identify seven tea leaf diseases in the present manuscript. These analyses revealed that LeafNet achieved higher accuracy than the SVM and MLP classification algorithms. CNNs thus have obvious advantages for identifying tea leaf diseases. Importantly, the results of the present study highlight the feasibility of applying CNNs to the identification of tea leaf diseases, which would significantly improve disease recognition for tea plant agriculture. Although the classification accuracy of LeafNet was not 100%, the present method can be refined in future studies to provide more efficient and accurate guidance for the control of tea leaf diseases.

In this manuscript, expanding the sample data was a time-consuming process. However, with the continuous growth of network information resources, the number of tea tree disease images will continue to increase. We must therefore collect images showing the different morphological features of each disease in its early, middle, and late stages and continuously expand the tea tree disease data set to make it more detailed and comprehensive.

At present, disease recognition runs on computer systems. However, as the performance of smartphones continues to improve, the deep convolutional neural network recognition model can be migrated to Android-based mobile applications. Such applications could provide timely and accurate information about diseases and help control tea tree diseases.



Acknowledgments

We acknowledge funding support from the Key R&D Projects of Ningxia Hui Autonomous Region (2017BY080), the National Natural Science Foundation of China (M1942001), and the Natural Science Foundation of Inner Mongolia Autonomous Region (2019MS08168).


Conflict of interest

The authors declare no conflict of interest.


References

  1. Yang N, Yuan M, Wang P, Zhang R, Sun J, Mao H. Tea diseases detection based on fast infrared thermal image processing technology. Journal of the Science of Food and Agriculture. 2019;99(7):3459-3466. DOI: 10.1002/jsfa.9564
  2. Rumpf T, Mahlein AK, Steiner U, Oerke EC, Dehne HW, Plümer L. Early detection and classification of plant diseases with support vector machines based on hyperspectral reflectance. Computers and Electronics in Agriculture. 2010;74(1):91-99. DOI: 10.1016/j.compag.2010.06.009
  3. Sladojevic S, Arsenovic M, Anderla A, Culibrk D, Stefanovic D. Deep neural networks based recognition of plant diseases by leaf image classification. Computational Intelligence and Neuroscience. 2016;6:1-11. DOI: 10.1155/2016/3289801
  4. Chaudhary P, Chaudhari AK, Cheeran AN. Color transform based approach for disease spot detection on plant leaf. International Journal of Computer Science and Telecommunications. 2012;3(6):65-70
  5. Chung CL, Huang KJ, Chen SY, Lai M, Chen Y, Kuo Y. Detecting bakanae disease in rice seedlings by machine vision. Computers and Electronics in Agriculture. 2016;121:404-411. DOI: 10.1016/j.compag.2016.01.008
  6. Hossain E, Hossain MF, Rahaman MA. A color and texture based approach for the detection and classification of plant leaf disease using KNN classifier. In: 2019 International Conference on Electrical, Computer and Communication Engineering (ECCE). IEEE. Vol. 2019. 2019. pp. 1-6. DOI: 10.1109/ecace.2019.8679247
  7. Shrivastava S, Singh SK, Hooda DS. Soybean plant foliar disease detection using image retrieval approaches. Multimedia Tools and Applications. 2017;76(24):26647-26674. DOI: 10.1007/s11042-016-4191-7
  8. Pydipati R, Burks TF, Lee WS. Identification of citrus disease using color texture features and discriminant analysis. Computers and Electronics in Agriculture. 2006;52(1–2):49-59. DOI: 10.1016/j.compag.2006.01.004
  9. Zhang S, Wu X, You Z, Zhang L. Leaf image based cucumber disease recognition using sparse representation classification. Computers and Electronics in Agriculture. 2017;134:135-141. DOI: 10.1016/j.compag.2017.01.014
  10. Diao ZH, Song YM, Wang YP, et al. Feature extraction of leaf images for mite disease in cotton fields. Advanced Materials Research. 2013;605:919-922. DOI: 10.4028/
  11. Ali H, Lali MI, Nawaz MZ, et al. Symptom based automated detection of citrus diseases using color histogram and textural descriptors. Computers and Electronics in Agriculture. 2017;138:92-104. DOI: 10.1016/j.compag.2017.04.008
  12. Pires RDL, Goncalves DN, Orue JPM. Local descriptors for soybean disease recognition. Computers and Electronics in Agriculture. 2016;125:48-55. DOI: 10.1016/j.compag.2016.04.032
  13. Zhang S, Zhu Y, You Z, Wu X. Fusion of super pixel, expectation maximization and PHOG for recognizing cucumber diseases. Computers and Electronics in Agriculture. 2017;140:338-347. DOI: 10.1016/j.compag.2017.06.016
  14. Zhang J, Marszalek M, Lazebnik S, Schmid C. Local features and kernels for classification of texture and object categories: A comprehensive study. International Journal of Computer Vision. 2007;73(2):213-238. DOI: 10.1109/cvprw.2006.121
  15. Wang H, Li G, Ma ZH, Li X. Image recognition of plant diseases based on principal component analysis and neural networks. In: Proceedings of the 2012 8th International Conference on Natural Computation. Chongqing, China, 29–31 May. 2012. pp. 246-251. DOI: 10.1109/icnc.2012.6234701
  16. Karmokar BC, Ullah MS, Siddiquee MK. Tea leaf diseases recognition using neural network ensemble. International Journal of Computer Applications. 2015;114(17):1-9. DOI: 10.5120/20071-1993
  17. Yao Q, Guan Z, Zhou Y, Tang J, Hu Y, Yang B. Application of support vector machine for detecting rice diseases using shape and color texture features. In: Proceedings of the International Conference on Engineering Computation. Hong Kong, China, 2–3 May. 2009. pp. 79-83. DOI: 10.1109/icec.2009.73
  18. Hossain MS, Mou RM, Hasan MM, et al. Recognition and detection of tea leaf’s diseases using support vector machine. In: 2018 IEEE 14th International Colloquium on Signal Processing & Its Applications (CSPA). IEEE. 2018. pp. 150-154. DOI: 10.1109/cspa.2018.8368703
  19. Xu M, Wang J, Gu S. Rapid identification of tea quality by E-nose and computer vision combining with a synergetic data fusion strategy. Journal of Food Engineering. 2019;241:10-17. DOI: 10.1016/j.jfoodeng.2018.07.020
  20. Zhang Y, Yang X, Cattani C, Rao R, Wang S, Phillips P. Tea category identification using a novel fractional Fourier entropy and Jaya algorithm. Entropy. 2016;18(3):77. DOI: 10.3390/e18030077
  21. Xu Y, Mei H, Lin L, Shi X, Zhou H. The study and exploitation of diagnosed and controled of tea disease’s expert system. System Sciences and Comprehensive Studies in Agriculture. 2003;19(2):93-96. DOI: 10.3969/j.issn.1001-0068.2003.02.004
  22. Zhang S, Wang Z, Zou X, et al. Recognition of tea disease spot based on hyperspectral image and genetic optimization neural network. Transactions of the Chinese Society of Agricultural Engineering. 2017;33(22):200-207. DOI: 10.11975/j.issn.1002-6819.2017.22.026
  23. Yosinski J, Clune J, Bengio Y. How transferable are features in deep neural networks? Advances in Neural Information Processing Systems. 2014;2014:3320-3328
  24. Ouppaphan P. Corn disease identification from leaf images using convolutional neural networks. In: 21st International Computer Science and Engineering Conference. 2017. pp. 233-238. DOI: 10.1109/icsec.2017.8443919
  25. Liu B, Zhang Y, He D, Li Y. Identification of apple leaf diseases based on deep convolutional neural networks. Symmetry-Basel. 2018;10(1):11. DOI: 10.3390/sym10010011
  26. Zhang X, Qiao Y, Meng F, Fan C, Zhang M. Identification of maize leaf diseases using improved deep convolutional neural networks. IEEE Access. 2018;6:30370-30377. DOI: 10.1109/access.2018.2844405
  27. Lehmann-Danzinger H. Diseases and pests of tea: Overview and possibilities of integrated pest and disease management. Journal of Agriculture in the Tropics and Subtropics. 2000;101(1):13-38
  28. Keith L, Ko WH, Sato DM. Identification guide for diseases of tea (Camellia sinensis). Plant Disease. 2006:1-4
  29. Chen J, Liu Q, Gao L. Visual tea leaf disease recognition using a convolutional neural network model. Symmetry. 2019;11(3):343. DOI: 10.3390/sym11030343
  30. Lowe DG. Object recognition from local scale-invariant features. IEEE. 1999:1150-1157. DOI: 10.1109/iccv.1999.790410
  31. Lazebnik S, Schmid C, Ponce J. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. IEEE. 2006:2169-2178. DOI: 10.1109/cvpr.2006.68
  32. Csurka G, Dance C, Fan L, et al. Visual categorization with bags of keypoints. In: International Workshop on Statistical Learning in Computer Vision (Prague). 2004. pp. 1-22. Available from:∼efros/courses/AP06/Papers/csurka-eccv-04.pdf
  33. Hartigan JA, Wong MA. A k-means clustering algorithm. Journal of the Royal Statistical Society. Series C (Applied Statistics). 1979;28(1):100-108
  34. Vapnik V. Statistical Learning Theory. New York, NY, USA: John Wiley and Sons; 1998. DOI: 10.1002/9780470140529.ch4
  35. Fan R-E, Chang K-W, Hsieh C-J. LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research. 2008;9(9):1871-1874
  36. Rosenblatt F. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review. 1958;65:386-408. DOI: 10.1037/h0042519
  37. Kim P. MATLAB Deep Learning with Machine Learning, Neural Networks and Artificial Intelligence. Berkeley, CA, USA: Apress; 2017. pp. 114-116. DOI: 10.1007/978-1-4842-2845-61
  38. Sutskever I, Martens J, Dahl G, et al. On the importance of initialization and momentum in deep learning. International Conference on Machine Learning. 2013;28:1139-1147
  39. Benammar EA, Cascio D, Bruno S, Ciaccio MC, Cipolla M, Fauci A, et al. Computer-assisted classification patterns in autoimmune diagnostics: The AIDA project. BioMed Research International. 2016;2016:1-9. DOI: 10.1155/2016/2073076
  40. Rangarajan Aravind K, Raja P. Automated disease classification in (selected) agricultural crops using transfer learning. Automatika. 2020;61(2):260-272. DOI: 10.1080/00051144.2020.1728911
